Education
- Ph.D., Northwestern University, 2020 (2021 conferred)
- M.S., Penn State University, 2014 (2015 conferred)
- B.E., China University of Petroleum, 2012
Work experience
- Peking University (2024 - Present): Assistant Professor
- Yale University (2021 - 2023): Postdoctoral Researcher
- Northwestern University (2015 - 2020): Research and Teaching Assistant
- Penn State University (2012 - 2014): Research Fellow
- Industry: EQT (2014 - 2015)
Skills
- Analytical tools:
  - Isotopes: IRMS, TIMS, MC-ICP-MS, LA-ICP-MS
  - Elements: ICP-MS, ICP-OES, Coulometer
  - Imaging: SEM, TEM, Raman Spectroscopy, Neutron Scattering
- Programming languages:
  - Object-oriented design using Java
  - Web development using HTML, CSS, JavaScript
  - Machine learning and statistical analysis using R, Python
- Databases:
  - Relational databases (MySQL)
  - Non-relational databases (MongoDB)
  - Graph databases (Neo4j)
Awards and leadership
- Agouron Geobiology Fellowship (2022)
- Sloss Award, Northwestern (2021)
- President, Chinese Student and Scholar Association at Northwestern (2016-2018)
- Contributor, Northwestern Inclusion and Diversity Report (2017/2018)
- Medill Science Writing Fellow, Northwestern (2018)
- Northwestern Outstanding Graduate Fellowship (2015)
- Graduate Research Fellowship, Geological Society of America (2014)
- Imperial Barrel Award, American Association of Petroleum Geologists (2014)
- Chesapeake Award, Penn State (2013)
- Shell Energy Research Grant, Penn State (2013)
- National Scholarship of China (2011/2010)
Technical Projects (Big Data and Machine Learning)
- Deterministic Security Log Analysis and Visualization System (paper accepted at ACM CSS 2020): an instrumentation-free, user-centric attack investigation system with high-level semantics, built in Java
  - Helped design the system architecture and the algorithms that extract information from system logs
  - Designed a data-processing application that uses the Longest Common Subsequence (LCS) algorithm to extract common patterns from massive logs (>100 GB) and generate ready-to-use JSON files
  - Developed a website for users to upload collected logs (Java Spring MVC) and to visualize the logs and their semantics interactively (Google Catapult Trace-View JavaScript frontend)
  - Implemented Google Catapult Trace-View to visualize recorded events and their semantics
  - Built graph databases (Neo4j) to store the dependency graph of system events and reduced data import time by more than 100-fold
- Production Target Area Forecast, EQT Corporation (Applied Scientist Intern – pattern recognition and machine learning using R)
  - Implemented principal component analysis, multiclass classification, and regression in R to analyze gigabytes of multi-dimensional data (1000+ wells) for shale-gas production
  - Optimized the multivariable statistical model for exploration area prediction, reducing error by 72%
  - Trained supervised machine learning models, including Logistic Regression, Random Forest, and K-Nearest Neighbors, and applied regularization with measured parameters to overcome overfitting
  - Delivered 6 maps for production decisions and presented them directly to the CEO and management team