TY - GEN
T1 - DLATK
T2 - 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, EMNLP 2017
AU - Schwartz, H. Andrew
AU - Giorgi, Salvatore
AU - Sap, Maarten
AU - Crutchley, Patrick
AU - Eichstaedt, Johannes C.
AU - Ungar, Lyle
N1 - Publisher Copyright: © 2017 Association for Computational Linguistics.
PY - 2017
Y1 - 2017
AB - We present the Differential Language Analysis Toolkit (DLATK), an open-source Python package and command-line tool developed for conducting social-scientific language analyses. While DLATK provides standard NLP pipeline steps such as tokenization or SVM classification, its novel strengths lie in analyses useful for psychological, health, and social science: (1) incorporation of extra-linguistic structured information, (2) specified levels and units of analysis (e.g. document, user, community), (3) statistical metrics for continuous outcomes, and (4) robust, proven, and accurate pipelines for social-scientific prediction problems. DLATK integrates multiple popular packages (SKLearn, Mallet), enables interactive usage (Jupyter Notebooks), and generally follows object-oriented principles to make it easy to tie in additional libraries or storage technologies.
UR - https://www.scopus.com/pages/publications/85045035739
U2 - 10.18653/v1/d17-2010
DO - 10.18653/v1/d17-2010
M3 - Conference contribution
T3 - EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Proceedings
SP - 55
EP - 60
BT - EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing
PB - Association for Computational Linguistics (ACL)
Y2 - 9 September 2017 through 11 September 2017
ER -