About Me

I'm an MSc Data Science graduate from the University of Manchester (UK). My main focus is interdisciplinary work that bridges the gap between satellite data and ecosystems using data science approaches. I aim to become a data scientist specializing in data-driven, scientific, end-to-end solutions for biodiversity and sustainable supply chain management.

My Skills

  • Programming languages: Python, JavaScript
  • Computer vision (PyTorch & Keras)
  • Text mining
  • Interactive website development (React)
  • SQL/MongoDB design and management
  • Scientific review and report writing

Some of my projects

Here I list some of the academic and non-academic projects that I am proud of (in no particular order).

Master's Dissertation: Deep Learning Analysis for Volcanic Plume Detection Using Sentinel-5P Data

Traditional methods for volcanic plume identification are labor-intensive and often impractical for global monitoring. Satellite data offers an alternative, exemplified by the European Space Agency’s Sentinel-5P satellite, which can measure atmospheric SO2 with unprecedented spatial resolution. This dissertation tackles these challenges by developing an automated, transfer learning-based approach for global volcanic activity detection and plume identification.

Automatic Car Social Media Analysis: Sentiment Analysis and NER

Sentiment analysis and named entity recognition are fundamental components of social media analysis. In this project, I collected public discussions about self-driving cars posted on Twitter between 2023-02-27 and 2023-03-09, based on a list of keywords and hashtags. I then applied several techniques, including rule-based sentiment analysis, transformers, and support vector machines, and compared the results of these approaches.
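As a rough illustration of the rule-based end of that comparison, here is a minimal sketch of a lexicon-based sentiment scorer with simple negation handling. The word list, the weights, and the function name are hypothetical and chosen only for this example; they are not the lexicon or rules used in the project.

```python
import re

# Hypothetical mini-lexicon for illustration only; real lexicons are far larger.
LEXICON = {
    "love": 2.0, "great": 2.0, "safe": 1.0,
    "scary": -2.0, "crash": -2.0, "dangerous": -2.0,
}
NEGATIONS = {"not", "no", "never"}


def rule_based_sentiment(text):
    """Label text as positive, negative, or neutral using lexicon scores."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    for i, token in enumerate(tokens):
        if token in LEXICON:
            polarity = LEXICON[token]
            # Flip the polarity when the previous token is a negation word.
            if i > 0 and tokens[i - 1] in NEGATIONS:
                polarity = -polarity
            score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A rule-based scorer like this needs no training data, which makes it a convenient baseline against learned models such as transformers or support vector machines.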

Supply Chain Traceability

Supply chain traceability is a crucial element of business practice and has gained importance over the years, especially with the evolution of Industry 4.0 solutions. In this project, a mathematical model was developed to calculate the probability that a specific commercial product produced in one country was sourced from another. An interactive dashboard was also built to display the results and trace the supply chain globally.

Question Classification: Text Mining

Question classification is a vital aspect of natural language processing (NLP) and is essential for developing automated question-answering systems. This project categorised questions into predefined types using two classifiers (bag-of-words and BiLSTM) with an NLP pipeline comprising text parsing, pre-processing, word embedding and sentence representation.
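To make the sentence representation step concrete, here is a minimal sketch of a count-based bag-of-words encoding. The helper names and the whitespace tokenisation are assumptions for illustration; the project's own pipeline may have built its representation differently (for example, from word embeddings).

```python
from collections import Counter


def build_vocab(sentences):
    """Map each distinct lowercase token to an index, in first-seen order."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def bag_of_words(sentence, vocab):
    """Return a count vector for the sentence; unknown tokens are ignored."""
    counts = Counter(sentence.lower().split())
    vector = [0] * len(vocab)
    for token, count in counts.items():
        if token in vocab:
            vector[vocab[token]] = count
    return vector
```

The resulting fixed-length vector discards word order, which is the main contrast with a BiLSTM that reads the tokens in sequence.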

Diabetes Forecasting: ML Approaches

With the rapidly growing number of people with diabetes worldwide, the disease is regarded as a 21st-century challenge. If risk factors could be used to predict susceptibility to diabetes effectively, this would help people prevent it and make a considerable difference in clinical practice. In this project, we used the Pima Indians Diabetes Database to predict whether a patient will develop diabetes.

Drawing Image Classification: Domain Adversarial Training

Domain-adversarial neural network (DaNN) approaches build mappings between the source (training-time) and target (test-time) domains. In this project, I was given a dataset of 5,000 labelled 32×32 RGB real images (training) and 100,000 unlabelled 28×28 grayscale drawing images (test), and used a domain adaptation technique to predict the labels of the drawing images.

Acme Corporation Task Cost Forecasting: ML Approaches

(Group Project) Minimizing costs and increasing revenue are two principal means for businesses to achieve profitability. In this project, we developed a cost-minimization machine learning model to help Acme Corporation identify the most cost-effective supplier for a given daily operating task.

Monopoly Game: SQL Database Design

Monopoly is a world-renowned economics-themed board game. Players roll dice to move around the game board, buying properties, paying rent and drawing Chance cards. In this project, I created a SQL database for Monopoly that updates automatically when a single row is inserted into one table. It also provides a review of the previous move and gives each player a final score.

Eco-service Evaluation Model: Degradation-Factor Analysis

(EI Conference Paper) Environmental factors are often overlooked in assessments of land development projects, but environmental degradation can lead to costs far higher than the benefits. In this project, I analysed the costs and benefits of small community projects and large-scale national projects, and used a neural network to forecast the long-term costs and benefits of land development projects, taking the time factor into account.

China-ASEAN Electromechanical Products: Social Network Analysis

(Project in Chinese) The digital economy has flourished amid uncertainties such as the COVID-19 pandemic, and the scale of imports and exports of electronic products (e.g. integrated circuits) between China and ASEAN has been rising. In this project, social network analysis and QAP regression were used to analyse how each country's influence in the network changed over time (2009-2020).

JIZHI Human Resources Management System: Recommendation Algorithm

(Project in Chinese) Effective human resource management (HRM) practices are critical for small and medium enterprises (SMEs) to attract and retain top talent. This project involved developing a website (using Vue.js and Spring Boot) that caters to the HRM needs of SMEs. A recommendation algorithm streamlines the recruitment process, saving SMEs time and resources and increasing their chances of finding qualified candidates.

Yunnan Coffee: Re-utilising Resources in Provincial Special Industries

(Project in Chinese) Yunnan province in China has been producing high-quality coffee for several years, but despite its potential, Yunnan coffee has struggled to penetrate the Chinese market. The purpose of this research was to investigate the reasons behind this problem. The research was supported by the Coffee Association of Yunnan and Zhejiang University of Technology.

Contact Me

If you would like to discuss any of these projects, or an academic or professional opportunity to work together, you can contact me at my personal email: shanyanfei2022@outlook.com.