Ruochi Zhang


  • University of Pittsburgh, US. M.S. in Information Science, 08/18-Present
    • Cumulative GPA: 3.92/4.0
  • Jilin University, China. Bachelor's degree in Computer Science and Technology, 09/14-06/18

Publications (* means co-first authors)

  • Ruochi Zhang, Ruixue Zhao, Fengfeng Zhou. pyHIVE, a Health-related Image Visualization and Engineering system using Python. _BMC Bioinformatics_, Nov. 2018
  • Ruixue Zhao*, Ruochi Zhang*, Fengfeng Zhou. TriZ-a rotation-tolerant image feature and its application in endoscope-based disease diagnosis. _Computers in Biology and Medicine_, Aug. 2018
  • Fengxin, Ruochi Zhang, Fengfeng Zhou. An accurate regression of developmental stages for breast cancer based on transcriptomic biomarkers. _Biomarkers in Medicine_, Oct. 2018
  • Yuting Ye*, Ruochi Zhang*, Weiwei Zheng, Shuai Liu & Fengfeng Zhou. RIFS: A Randomly Restarted Incremental Feature Selection Algorithm. _Scientific Reports_ 2045-2322, Sept. 2017
  • Ruiquan Ge*, Guoqin Mai*, Ruochi Zhang*, Xundong Wu, Qing Wu, Fengfeng Zhou. MUSTv2: An Improved De Novo Detection Program for Recently Active Miniature Inverted Repeat Transposable Elements (MITEs). _Journal of Integrative Bioinformatics_, May 2017

Work Experience

Beijing XDstar Technology Co., Ltd. Beijing Jan 2018 - Aug 2018

Machine Learning Algorithm Engineer

  1. Bank transaction anomaly detection
    • Abnormal account and abnormal transaction detection: apply supervised learning to bank transaction data, using whether each transaction was reported as abnormal as the label. A pipeline of data cleaning, feature construction, SMOTE sampling, and RandomForest or XGBoost training and prediction assigns each account and transaction an anomaly score;
    • Detect abnormal points in bank transaction data with an LSTM neural network model;
    • Build a knowledge graph in Neo4j from entities (tellers, customers, accounts, etc.) and relationships (transactions, operations, etc.) to detect money cycles and suspicious gang fraud.
  2. Log clustering and anomaly detection
    • Interact with ElasticSearch, load large-scale log data and write back model results;
    • Implement data cleaning, discretization, and feature engineering. Use clustering algorithms to cluster logs and perform anomaly analysis to generate security events;
    • Design a time series anomaly detection algorithm to find outliers in the Welfare Lottery log;
    • Use Kibana and ECharts to visualize the results and produce a visual report.
  3. Natural Language Processing Platform
    • Text extraction and cleaning: convert PDF, Word, image, and other files into a unified txt format;
    • Topic mining based on the LDA model;
    • Find text keywords, key phrases, key sentences, and topic abstracts based on the TextRank model;
    • Train a word2vec model on Chinese Wikipedia to calculate the similarity between laws and regulations and financial institutions’ self-regulation rules.

Shanghai Institute of Life Sciences Shanghai Apr 2017 - Oct 2017

Algorithm Research Assistant

  1. Design dynamic network biomarker algorithms and single-sample dynamic network algorithms;
  2. Introduce parallelism, reducing the algorithm's computation time to 10 percent of the original on a 28-core Linux machine. GitHub address: https://github.com/zhangruochi/DNB


  • LSTM-based real-time stock forecasting (GitHub)
  • Heartbeat anomaly detection based on Fitbit and Google Home (GitHub)
  • Movie Information Management System (GitHub)
  • LeetCode Notes (GitHub)
  • AML system (GitHub)


  • Professional Emphasis: Data Mining, Machine Learning, Deep Learning, Data Analysis
  • Language: Python, Ruby, Java, C, Matlab, R, JavaScript, Shell
  • Framework: NumPy, Pandas, Scikit-Learn, MXNet, Keras, NetworkX, Matplotlib
  • Database: MySQL, Neo4j, ElasticSearch
  • Coursera Certificate: https://github.com/zhangruochi/Master-Computer-Science
