Natural Language Processing

NLP 4: Semantic Text Similarity and Topic Modeling

2021-07-27

Topic modeling is a useful tool for grasping the general picture of a long text document. Compared with RNNs such as LSTMs, a topic model is used more for exploration than for prediction. In this post I will share how to measure similarity between words, the concept of topic modeling, and its application in Python.

NLP 3: Text Classification in Python

2021-07-22

In the previous two posts, I shared basic concepts and useful functions of text mining and NLP. In this third post on text mining in Python, we finally proceed to the advanced part: building a text classification model. I will cover the main tasks of text classification, two useful classification models and their implementation in Python, and methods for improving classification performance.

NLP 2: NLTK Basics in Python

2021-07-10

In this post I will share the basic NLP tasks and how to handle each of them with the powerful NLTK library in Python.

NLP 1: Text Mining Application in Python (RegEx)

2021-06-15

In this exercise, we'll be working with messy medical data and using RegEx in Python to extract dates of different formats. The goal of this exercise is to correctly identify all of the different date variants encoded in this dataset and to properly standardize and sort the dates.
