DiMotif: alignment-free discriminative motif mining
DiMotif is an alignment-free discriminative motif miner. We evaluate the method for finding protein motifs in several settings. On a held-out set of sequences, the significant motifs it extracted reliably detected integrins, integrin-binding proteins, and biofilm formation-related proteins with high F1 scores. In addition, DiMotif detected experimentally verified motifs related to nuclear localization signals.
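To make the idea of alignment-free discriminative motif mining concrete, here is a minimal sketch. It is not the DiMotif implementation: it assumes fixed-length k-mers in place of the probabilistic variable-length segmentation described in the paper, and it scores each k-mer with a plain Pearson chi-square statistic on presence/absence counts. The function names (`kmers`, `chi2_2x2`, `rank_motifs`) and the toy sequences are hypothetical and exist only for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def rank_motifs(positives, negatives, k=3):
    """Rank k-mers over-represented in the positive set by chi-square score.

    Assumption: fixed-length k-mers stand in for DiMotif's variable-length
    segments. Both input lists must be non-empty.
    """
    # Count, for each k-mer, how many sequences in each class contain it.
    pos_counts = Counter(m for s in positives for m in kmers(s, k))
    neg_counts = Counter(m for s in negatives for m in kmers(s, k))
    n_pos, n_neg = len(positives), len(negatives)
    scored = []
    for motif, a in pos_counts.items():
        c = neg_counts.get(motif, 0)
        # Keep only motifs that are more frequent in the positive class.
        if a / n_pos > c / n_neg:
            scored.append((chi2_2x2(a, n_pos - a, c, n_neg - c), motif))
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda t: (-t[0], t[1]))


# Toy, made-up sequences: every positive contains the tripeptide RGD.
positives = ["MKRGDSA", "TTRGDLL", "ARGDKKP"]
negatives = ["MKLLSAA", "TTPQLLV", "AGGKKPW"]
print(rank_motifs(positives, negatives)[0])  # (6.0, 'RGD')
```

No alignment is computed at any point: sequences are compared only through the segments they contain, which is what makes the approach alignment-free. RGD is a well-known integrin-binding motif, which is why it was planted in the toy positives.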
The code is available on GitHub:
https://github.com/ehsanasgari/dimotif
The paper is under review; a preprint is available on bioRxiv, and the software is available on GitHub.
@article{Asgari345843,
  author    = {Asgari, Ehsaneddin and McHardy, Alice and Mofrad, Mohammad R. K.},
  title     = {Probabilistic variable-length segmentation of protein sequences for discriminative motif mining (DiMotif) and sequence embedding (ProtVecX)},
  year      = {2018},
  doi       = {10.1101/345843},
  publisher = {Cold Spring Harbor Laboratory},
  URL       = {https://www.biorxiv.org/content/early/2018/07/12/345843},
  eprint    = {https://www.biorxiv.org/content/early/2018/07/12/345843.full.pdf},
  journal   = {bioRxiv}
}