About

Ehsaneddin Asgari

Since August 2014, I have directed the Life Language Processing project under the supervision of Professor Mofrad at the Cell Biomechanics Laboratory at UC Berkeley.

Currently, I am a data scientist at the NLP expert center of Data:Lab Munich and a post-doctoral researcher at the Helmholtz Center for Infection Research.

My research interests are Language Processing, Bioinformatics, and Digital Humanities.

I completed my PhD at UC Berkeley in Applied Science and Technology with a Designated Emphasis in Computational and Data Science and Engineering.
My dissertation, "Life Language Processing: Deep Learning-based Language-agnostic Processing of Proteomics, Genomics/Metagenomics, and Human Languages", was supervised by Professor Mofrad in close collaboration with the Deep NLP group at the University of Munich (LMU) (directed by Prof. Hinrich Schütze) and the Helmholtz Center for Infection Research (directed by Prof. Alice McHardy).

I completed my second master's degree in Applied Science and Technology at UC Berkeley (2016) and my first master's degree in Computer Science (CS) at the Swiss Federal Institute of Technology - Lausanne (EPFL). During my M.Sc. in CS, I majored in Natural Language Processing and completed my master's thesis at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT) under the supervision of Professor Mark Finlayson and Professor Patrick Winston.

I obtained my B.Sc. degree in Computer Engineering from Sharif University of Technology in 2011.

Research Interests

Computational Biology
Metagenomics
Computational Literature
Natural Language Processing
Digital Humanities
Deep Learning
Deep Proteomics
Distributional Semantics
Signal Processing
Bioinformatics
Machine Learning
Biomedical Signal Processing
Psychomining

Education

University of California - Berkeley
PhD, August 2014 - August 2019.
Designated Emphasis in Computational and Data Science and Engineering
M.Sc. in Applied Science and Technology 2016.

Swiss Federal Institute of Technology - Lausanne (EPFL), Switzerland.
Masters in Computer Science, August 2011 - October 2014.

Sharif University of Technology, Iran, January 2008 - July 2011.
B.S. in Computer Engineering.

  • B.Sc. Thesis: Analytical Survey on Manifold-Based Semi-Supervised Learning.

Publications (Google Scholar)

Journal Papers

Khaledi A*, Weimann A*, Schniederjans M, Asgari E, Kuo T, Gabriel O, Kola A, Gastmeier P, Hogardt M, Jonas D, Mofrad MRK, Bremges A, McHardy AC, Häussler S.
Predicting antimicrobial resistance in Pseudomonas aeruginosa with machine learning-enabled molecular diagnostics
EMBO Molecular Medicine 2020 e10264.

Zhou, N. et al. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens
Genome Biology 20, 244 (2019) doi:10.1186/s13059-019-1835-8.

E. Asgari, A.C. McHardy, M.R.K. Mofrad,
Probabilistic variable-length segmentation of protein sequences for discriminative motif discovery (DiMotif) and sequence embedding (ProtVecX).
Scientific Reports 9, 3577 (2019) doi:10.1038/s41598-019-38746-w.

E. Asgari, P. C. Muench, Till R. Lesker, A.C. McHardy *, M.R.K. Mofrad *,
DiTaxa: Nucleotide-pair encoding of 16S rRNA for host phenotype and biomarker detection.
Bioinformatics, bty954, 2018 (https://doi.org/10.1093/bioinformatics/bty954).

E. Asgari, K. Garakani, A.C. McHardy, M.R.K. Mofrad,
MicroPheno: Predicting environments and host phenotypes from 16S rRNA gene sequencing using a k-mer based representation of shallow sub-samples.
Bioinformatics, Volume 34, Issue 13, 1 July 2018, Pages i32-i42.

Z. Jahed, D. Fadavi, U. Vu, E. Asgari, G. Luxton, M.R.K. Mofrad,
Molecular Insights into Mechanism of SUN1 Assembly in the Nuclear Envelope.
Biophysical Journal, Volume 114, Issue 5, p1190-1203, 13 March 2018.

E. Asgari, M.R.K. Mofrad,
Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics.
PLoS ONE 10(11): e0141287, 2015.

M. Neshati, D. Hiemstra, E. Asgari, H. Beigy,
Integration of scientific and social networks.
World Wide Web Journal Springer (WWW Journal) 2014, 1-29.

Book Chapters

E. Asgari and M. R. K Mofrad
Deep Genomics and Proteomics: Language Model-Based Embedding of Biological Sequences and Their Applications in Bioinformatics
Leveraging Biomedical and Healthcare Data (pp. 167-181) 2019. Academic Press.

H. Adel and E. Asgari and H. Schütze
Overview of Character-Based Models for Natural Language Processing
Lecture Notes in Computer Science book series (LNCS, volume 10761) 2017, pp. 3-16.

Peer-reviewed Conference/Workshop Papers

E. Asgari, F. Braune, B. Roth, C. Ringlstetter, M. R.K. Mofrad,
UniSent: Universal Adaptable Sentiment Lexica for 1000+ Languages.
Accepted at LREC: Language Resources and Evaluation Conference 2020, Marseille, France, May 2020. (long paper)

{Highlighted by MIT Tech Review Magazine} E. Asgari and H. Schütze,
Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Copenhagen, Denmark, September 2017. (long paper)

{Best Paper Award supported by Google} E. Asgari, M.R.K. Mofrad,
Comparing Fifty Natural Languages and Twelve Genetic Languages Using Word Embedding Language Divergence (WELD) as a Quantitative Measure of Language Distance.
In Proceedings of the NAACL-HLT Workshop on Multilingual and Cross-lingual Methods in NLP, San Diego, CA, June 2016. (long paper)

{Best poster award supported by Google} E. Asgari, M.R.K. Mofrad,
Word Vectors of Biological Sequences and Their Applications in Bioinformatics.
International Conference on Machine Learning (ICML) Workshop on Computational Biology, New York City, NY, June 2016. (short paper)

E. Asgari, S. Nasiriany, M.R.K. Mofrad,
Text Analysis and Automatic Triage of Posts in a Mental Health Forum.
In Proceedings of the NAACL-HLT Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, San Diego, CA, June 2016. (short paper)

E. Asgari, M. Ghassemi, M. Finlayson,
Confirming the Themes and Interpretive Unity of Ghazal Poetry Using Topic Models.
In Proceedings of the Neural Information Processing Systems (NIPS) Workshop on Topic Models, Lake Tahoe, NV, December 2013. (short paper)

E. Asgari, J-C. Chappelier,
Linguistic Resources and Topic Models for the Analysis of Persian Poems.
In Proceedings of the NAACL-HLT Workshop on Computational Linguistics for Literature, pages 23-31, Atlanta, GA, June 2013. (long paper)

M. Neshati, E. Asgari, D. Hiemstra, H. Beigy,
A Joint Classification Method For Scientific and Social Network Integration.
In Proceedings of the European Conference on Information Retrieval (ECIR), Moscow, Russia, March 2013. (long paper)

E. Asgari, J-C. Chappelier,
Analysis of Persian Poems with Computational Linguistics Tools.
In Proceedings of the Fifth International Conference on Iranian Linguistics (ICIL), Bamberg, Germany, August 2013. (short paper)

Preprints

H. Schütze, H. Adel, and E. Asgari
Nonsymbolic Text Representation
Preprint, arXiv:1512.00397 (2017).
{Originally a short paper written by the first author for EACL 2017}

E. Asgari and A. Sanaei
Measuring Countries' Human Rights Positions in Universal Periodic Review
Available at SSRN, http://dx.doi.org/10.2139/ssrn.3029031.
{Also presented at the American Political Science Association annual meeting 2017, San Francisco.}

Patent

Hamid R. Zarandi, A. Fattaholmanan, A. Vakilian, E. Asgari, Mohammad R. Besharati,
HodHod, Auto Hardware Description Generator.
IRI Patent 64177, March 2010.

Technical Reports

(Poster)

M. Abbaspour, E. Asgari, S. Bagheri, P. Khanipour, Minh N. Do, J. Lu, S. MahAbadi, A. Vakilian,
Automatic pill identification,
ADSC Technical Reports, Oct. 2010.

(Poster)

M. Abbaspour, E. Asgari, S. Bagheri, P. Khanipour, Minh N. Do, J. Lu, S. MahAbadi, A. Vakilian,
Indoor positioning and navigation with a camera phone,
ADSC Technical Reports, Oct. 2010.

Work Experience

Graduate Student Instructor (August 2016 - December 2016)
Introduction to Biomechanics: Analysis and Design
Department of Bioengineering, UC Berkeley, USA.

Graduate Student Instructor (June 2016 - August 2016)
Data Structure and Algorithm
MIDS program, Information School (ISchool), UC Berkeley, USA.

Technical Reviewer for the Master of Information and Data Science Program (January 2016 - August 2016)
Admission Committee of Information School (ISchool), UC Berkeley, USA.

Graduate Student Instructor (January 2016 - June 2016)
Self-Paced Center of EECS Department, UC Berkeley, USA.

Intern Researcher in Deep Natural Language Processing (Summer 2014),
Supervised by Hinrich Schütze, The Center for Information and Language Processing, Ludwig Maximilian University of Munich, Germany.

Visiting Student: Master's Thesis (August 2013 - April 2014),
Supervised by Mark Finlayson and Patrick Winston, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Boston, USA.

Visiting Student: Intern Researcher (May 2013 - August 2013),
Supervised by Emery N Brown, Neuroscience Statistics Lab, Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Boston, USA.

Graduate Research Assistant (December 2012 - May 2013),
Supervised by Martin Vetterli, Audiovisual Communications Laboratory in collaboration with Qualcomm, Swiss Federal Institute of Technology, Lausanne, Switzerland.

Intern: Data-mining for the Mining Industry, (Summer 2012),
Supervised by S. Gaulocher, ABB Corporate Research, Zurich, Switzerland.

Graduate Research Assistant (February 2012 - June 2012),
Supervised by Pascal Frossard, Signal Processing Laboratory, Swiss Federal Institute of Technology, Lausanne (EPFL), Switzerland.

Intern, (Summer 2010),
Supervised by Minh Do, Advanced Digital Sciences Center of University of Illinois at Urbana-Champaign, Singapore.

Copyright 2018, by Ehsaneddin Asgari. All rights reserved