Deep Learning 60

LM, Language Model, Language Modeling, Conditional Probability, Statistical Language Model, n-gram

LM (Language Model): a model that assigns probabilities to word sequences in order to model language; in other words, it finds the most natural word sequence. Language Modeling: predicting an unknown word from the given words. Conditional Probability: the probability of an event occurring given that another event has already occurred. In this theory, mutually exclusive events are events that cannot occur s..
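To make the link between conditional probability and n-gram language models concrete, here is a minimal sketch of a bigram model, assuming whitespace tokenization and plain maximum-likelihood counts with no smoothing. The toy corpus, the `<s>`/`</s>` boundary markers, and all function names are illustrative assumptions, not taken from the post.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count context words and bigrams over whitespace-tokenized sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # <s> and </s> are assumed sentence-boundary markers.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        context_counts.update(tokens[:-1])             # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return context_counts, bigram_counts


def conditional_probability(word, previous, context_counts, bigram_counts):
    """Estimate P(word | previous) = count(previous, word) / count(previous)."""
    if context_counts[previous] == 0:
        return 0.0
    return bigram_counts[(previous, word)] / context_counts[previous]


def sequence_probability(sentence, context_counts, bigram_counts):
    """Multiply the bigram conditional probabilities along the sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    probability = 1.0
    for previous, word in zip(tokens, tokens[1:]):
        probability *= conditional_probability(
            word, previous, context_counts, bigram_counts
        )
    return probability


# Toy corpus, invented purely for illustration.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
context_counts, bigram_counts = train_bigram_model(corpus)

p_cat_given_the = conditional_probability("cat", "the", context_counts, bigram_counts)
p_seen = sequence_probability("the cat sat", context_counts, bigram_counts)
p_unseen = sequence_probability("the dog ran", context_counts, bigram_counts)
print(p_cat_given_the, p_seen, p_unseen)
```

Because nothing is smoothed, any sentence containing a bigram that never occurred in the corpus gets probability zero; practical n-gram models usually add smoothing or back-off to avoid this.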

Deep Learning 2021.03.09

normalization, WordNetLemmatizer, PorterStemmer, LancasterStemmer, Stopword

Normalization: merge different forms of the same word into a single form, e.g. US and USA mean the same thing, so both are written as US. 1. WordNetLemmatizer: if a word has several forms, find its root word (lemma), e.g. the root of 'am', 'are', 'is' is 'be'.
from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()
words = ['have', 'going', 'loves', 'lives', 'flies', 'dies', 'watched', 'has', 'starting']
print('b..
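The US/USA case can be sketched with a simple lookup table. This is a minimal illustration that uses only the standard library rather than NLTK; the mapping, the sample sentence, and the function name are assumptions made for the example.

```python
# Hypothetical mapping: variant forms on the left, canonical form on the right.
# Matching is case-sensitive so the pronoun "us" is left untouched.
NORMALIZATION_TABLE = {
    "USA": "US",
    "U.S.": "US",
    "U.S.A.": "US",
}


def normalize_tokens(tokens, table):
    """Replace each token with its canonical form if the table lists one."""
    return [table.get(token, token) for token in tokens]


tokens = "The USA and the U.S. refer to the US".split()
normalized = normalize_tokens(tokens, NORMALIZATION_TABLE)
print(normalized)
```

A table like this only covers the variants someone has listed by hand, which is why stemmers and lemmatizers are used for the general case of inflected word forms.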

Deep Learning 2021.03.05

gensim, Scikit-learn, NLTK, TreebankWordTokenizer, WordPunctTokenizer, sent_tokenize, pos_tag, word_tokenize, NLP, text_to_word_sequence, Corpus

Corpus - Natural language data
NLP - Natural Language Processing
gensim - An open-source library for unsupervised topic modeling and natural language processing, using modern statistical machine learning.
Scikit-learn - SciPy Toolkit. It features various classification, regression, and clustering algorithms, including support vector machines.
NLTK - The Natural Language Toolkit is a suite of ..
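The title of this post lists several tokenizers. As a rough illustration of what word-level tokenization does, the sketch below splits text into runs of word characters and runs of punctuation with a single regular expression, which is approximately the rule behind NLTK's WordPunctTokenizer. It uses only the standard library; the function name and the sample sentence are assumptions for the example, and real tokenizers such as TreebankWordTokenizer handle contractions differently.

```python
import re

# Either a run of word characters, or a run of characters that are
# neither word characters nor whitespace (i.e. punctuation).
WORD_OR_PUNCT = re.compile(r"\w+|[^\w\s]+")


def simple_word_punct_tokenize(text):
    """Split text into alphanumeric tokens and punctuation tokens."""
    return WORD_OR_PUNCT.findall(text)


sample = "Don't hesitate, it's Mr. Smith's idea."
word_tokens = simple_word_punct_tokenize(sample)
print(word_tokens)
```

Note how the apostrophe in "Don't" becomes its own token under this rule; which tokenizer to choose depends on whether that behavior suits the downstream task.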