Faculty Details

Daniel B. Neill

Associate Professor of Information Systems


Office: HBH 2105B
Voice: 412-268-3885
Email: neill@cs.cmu.edu
Personal Website

Biography

Daniel B. Neill is the Dean's Career Development Professor and Associate Professor of Information Systems at Carnegie Mellon University's H.J. Heinz III College, where he directs the Event and Pattern Detection Laboratory and the Joint Ph.D. Program in Machine Learning and Public Policy. He also holds courtesy appointments in the Machine Learning Department and Robotics Institute at CMU's School of Computer Science, and an adjunct appointment in the University of Pittsburgh's Department of Biomedical Informatics. He received his Ph.D. in Computer Science from CMU in 2006. Before that, he received his B.S.E. from Duke University, M.Phil. from Cambridge University, and M.S. from Carnegie Mellon.

Prof. Neill's research focuses on novel statistical and computational methods for discovery of emerging events and other relevant patterns in complex and massive datasets, applied to real-world policy problems ranging from medicine and public health to law enforcement and security. Application areas include disease surveillance (e.g., using electronically available public health data such as hospital visits and medication sales to automatically identify and characterize emerging outbreaks), law enforcement (e.g., detection and prediction of crime patterns using offense reports and 911 calls), health care (e.g., detecting anomalous patterns of care which significantly impact patient outcomes), and urban analytics (e.g., helping city governments to predict and proactively respond to emerging patterns of citizen needs). He has pioneered the use of "fast subset scan" methods to efficiently and accurately detect anomalous patterns in massive, complex datasets, as well as the use of "Bayesian spatial scan" approaches to detect and characterize events (such as disease outbreaks) in space-time data.

His research has been supported by the National Science Foundation, MacArthur Foundation, Richard King Mellon Foundation, and many others, and has been published in top journals such as the Journal of the Royal Statistical Society, Journal of Machine Learning Research, Machine Learning Journal, Journal of Computational and Graphical Statistics, Statistics in Medicine, and Big Data, along with top machine learning and data mining conferences such as NIPS, ICML, KDD, AISTATS, and ICDM. He has received best paper recognition from the Journal of Computational and Graphical Statistics and the National Syndromic Surveillance Conference. He currently serves as an advisor to the board of directors for the International Society for Disease Surveillance, and Associate Editor and "AI and Health" Department Editor of IEEE Intelligent Systems. Prof. Neill has served as scientific program chair of the International Society for Disease Surveillance Annual Conference and co-chaired the last two International Conferences on Smart Health. He is the recipient of a National Science Foundation CAREER award and was recently named one of the top ten "researchers to watch" in the artificial intelligence field.

Prof. Neill has been actively involved in curriculum development and teaching at the intersection of machine learning and public policy. He is the developer and coordinator of CMU's Joint Ph.D. Program in Machine Learning and Public Policy, jointly administered by the Machine Learning Department (School of Computer Science) and Heinz College. He has developed an introductory course in "Large Scale Data Analysis for Policy" (90-866) for the MSPPM program, a Ph.D. Research Seminar in Machine Learning and Policy (90-904/10-830), and a series of courses, "Special Topics in Machine Learning and Policy" (90-921/10-831), with topics including "Event and Pattern Detection", "Machine Learning for the Developing World", and "Harnessing the Wisdom of Crowds". He also teaches the core statistics course for the MISM program (95-796, "Statistics for IT Managers").

Please see Prof. Neill's personal website and the Event and Pattern Detection Laboratory website for the most up-to-date information about his current projects, publications, and other activities.

Publications

EVENT AND PATTERN DETECTION - SUBSET SCAN

Skyler Speakman, Sriram Somanchi, Edward McFowland III, and Daniel B. Neill. Penalized fast subset scanning. Journal of Computational and Graphical Statistics, 2016, in press. Selected for "Best of JCGS" invited session by the journal's editor in chief. (accepted author version)

Daniel B. Neill. Subset scanning for event and pattern detection. In S. Shekhar and H. Xiong, eds., Encyclopedia of GIS, 2nd ed., Springer, 2016, in press.

Skyler Speakman, Edward McFowland III, and Daniel B. Neill. Scalable detection of anomalous patterns with connectivity constraints. Journal of Computational and Graphical Statistics 24(4): 1014-1033, 2015. (pdf)

Edward McFowland III, Skyler Speakman, and Daniel B. Neill. Fast generalized subset scan for anomalous pattern detection. Journal of Machine Learning Research, 14: 1533-1561, 2013. (pdf)

Skyler Speakman, Yating Zhang, and Daniel B. Neill. Dynamic pattern detection with temporal consistency and connectivity constraints. Proc. 13th IEEE International Conference on Data Mining, 697-706, 2013. (pdf)

Daniel B. Neill, Edward McFowland III, and Huanian Zheng. Fast subset scan for multivariate event detection. Statistics in Medicine 32: 2185-2208, 2013. (pdf)

Daniel B. Neill. Fast subset scan for spatial pattern detection. Journal of the Royal Statistical Society (Series B: Statistical Methodology) 74(2): 337-360, 2012. (pdf)


EVENT AND PATTERN DETECTION - TWITTER EVENT DETECTION

Feng Chen and Daniel B. Neill. Human rights event detection from heterogeneous social media graphs. Big Data 3(1): 34-40, 2015. (pdf)

Feng Chen and Daniel B. Neill. Non-parametric scan statistics for event detection and forecasting in heterogeneous social media graphs. Proceedings of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1166-1175, 2014. (pdf)


EVENT AND PATTERN DETECTION - BAYESIAN SCAN STATISTICS

Kan Shao, Yandong Liu, and Daniel B. Neill. A generalized fast subset sums framework for Bayesian event detection. Proceedings of the 11th IEEE International Conference on Data Mining, 617-625, 2011. (pdf)

Daniel B. Neill. Fast Bayesian scan statistics for multivariate event detection and visualization. Statistics in Medicine 30(5): 455-469, 2011. (pdf)

Daniel B. Neill and Gregory F. Cooper. A multivariate Bayesian scan statistic for early event detection and characterization. Machine Learning 79: 261-282, 2010. (pdf)

Daniel B. Neill, Gregory F. Cooper, Kaustav Das, Xia Jiang, and Jeff Schneider. Bayesian network scan statistics for multivariate pattern detection. In J. Glaz, V. Pozdnyakov, and S. Wallenstein, eds., Scan Statistics: Methods and Applications, 221-250, 2009. (pdf)

Maxim Makatchev and Daniel B. Neill. Learning outbreak regions in Bayesian spatial scan statistics. Proceedings of the ICML/UAI/COLT Workshop on Machine Learning for Health Care Applications, 2008. (pdf)

Daniel B. Neill, Andrew W. Moore, and Gregory F. Cooper. A Bayesian spatial scan statistic. In Y. Weiss, et al., eds. Advances in Neural Information Processing Systems 18, 1003-1010, 2006. (pdf)


EVENT AND PATTERN DETECTION - SPATIAL SCAN STATISTICS


Daniel Oliveira, Daniel B. Neill, James H. Garrett Jr., and Lucio Soibelman. Detection of patterns in water distribution pipe breakage using spatial scan statistics for point events in a physical network. Journal of Computing in Civil Engineering 25(1): 21-30, 2011. (pdf)

Daniel B. Neill. An empirical comparison of spatial scan statistics for outbreak detection. International Journal of Health Geographics 8: 20, 2009. (pdf) (open access)

Daniel B. Neill. Expectation-based scan statistics for monitoring spatial time series data. International Journal of Forecasting 25: 498-517, 2009. (pdf)

Daniel B. Neill, Andrew W. Moore, Maheshkumar Sabhnani, and Kenny Daniel. Detection of emerging space-time clusters. Proceedings of the 11th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 218-227, 2005. (pdf)

Daniel B. Neill and Andrew W. Moore. Anomalous spatial cluster detection. Proceedings of the KDD 2005 Workshop on Data Mining Methods for Anomaly Detection, 2005. (pdf)

Daniel B. Neill, Andrew W. Moore, Francisco Pereira, and Tom Mitchell. Detecting significant multidimensional spatial clusters. In L.K. Saul, et al., eds. Advances in Neural Information Processing Systems 17, 969-976, 2005. (pdf)

Daniel B. Neill and Andrew W. Moore. Rapid detection of significant spatial clusters. Proceedings of the 10th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 256-265, 2004. (pdf)


EVENT AND PATTERN DETECTION - GENERAL

Daniel B. Neill and Weng-Keen Wong. A tutorial on event detection. Presented at the 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2009. (pdf)

Kaustav Das, Jeff Schneider, and Daniel B. Neill. Anomaly pattern detection in categorical datasets. Proceedings of the 14th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 169-176, 2008. (pdf)

Daniel B. Neill. Detection of spatial and spatio-temporal clusters. Ph.D. thesis, Carnegie Mellon University, Department of Computer Science, Technical Report CMU-CS-06-142, 2006. (pdf)


BAYESIAN NONPARAMETRICS / GAUSSIAN PROCESSES

Seth R. Flaxman, Daniel B. Neill, and Alexander J. Smola. Gaussian processes for independence tests with non-iid data in causal inference. ACM Transactions on Intelligent Systems and Technology, 2016, in press. (accepted author version)

William Herlands, Andrew Gordon Wilson, Hannes Nickisch, Seth Flaxman, Daniel B. Neill, Willem van Panhuis, and Eric P. Xing. Scalable Gaussian processes for characterizing multidimensional change surfaces. Proc. 19th International Conference on Artificial Intelligence and Statistics, 2016, in press.

Seth R. Flaxman, Andrew Gordon Wilson, Daniel B. Neill, Hannes Nickisch, and Alexander J. Smola. Fast Kronecker inference in Gaussian processes with non-Gaussian likelihoods. Proc. 32nd International Conference on Machine Learning, JMLR: W&CP 37, 2015. (pdf)


PUBLIC HEALTH / DISEASE SURVEILLANCE

Zachary Faigen, Lana Deyneka, Amy Ising, Daniel B. Neill, Mike Conway, Geoffrey Fairchild, Julia Gunn, David Swenson, Ian Painter, Lauren Johnson, Chris Kiley, Laura Streichert, and Howard Burkom. Cross-disciplinary consultancy to bridge public health technical needs and analytic developers: a syndromic surveillance use case. Online Journal of Public Health Informatics, 7(3):e228, 2015. (pdf)

Daniel B. Neill. New directions in artificial intelligence for public health surveillance. IEEE Intelligent Systems 27(1): 56-59, 2012. (pdf)

Xia Jiang, Gregory F. Cooper, and Daniel B. Neill. Generalized AMOC curves for evaluation and improvement of event surveillance. Proceedings of the American Medical Informatics Association Annual Symposium, 281-285, 2009. (pdf)

Maheshkumar R. Sabhnani, Daniel B. Neill, Andrew W. Moore, Fu-Chiang Tsui, Michael M. Wagner, and Jeremy U. Espino. Detecting anomalous patterns in pharmacy retail data. Proceedings of the KDD 2005 Workshop on Data Mining Methods for Anomaly Detection, 2005. (pdf)

M. Wagner, F.-C. Tsui, J. Espino, W. Hogan, J. Hutman, J. Hersh, D. Neill, A. Moore, G. Parks, C. Lewis, and R. Aller. A national retail data monitor for public health surveillance. Morbidity and Mortality Weekly Report 53: 40-42, 2004. (pdf)


HEALTH CARE INFORMATION SYSTEMS

Daniel Gartner, Rainer Kolisch, Daniel B. Neill, and Rema Padman. Machine learning approaches for early DRG classification and resource allocation. INFORMS Journal on Computing 27(4): 718-734, 2015. (pdf) (supplementary material)

Daniel B. Neill. Using artificial intelligence to improve hospital inpatient care. IEEE Intelligent Systems 28(2): 92-95, 2013. (pdf)

Sriram Somanchi and Daniel B. Neill. Discovering anomalous patterns in large digital pathology images. Proc. 8th INFORMS Workshop on Data Mining and Health Informatics, 2013. (pdf)

Christopher A. Harle, Daniel B. Neill, and Rema Padman. Information visualization for chronic disease risk assessment. IEEE Intelligent Systems 27(6): 81-85, 2012. (pdf)

Sharique Hasan, George T. Duncan, Daniel B. Neill, and Rema Padman. Automatic detection of omissions in medication lists. Journal of the American Medical Informatics Association 18(4): 449-458, 2011. (pdf)

Huanian Zheng, Rema Padman, Sharique Hasan, and Daniel B. Neill. A comparison of collaborative filtering methods for medication reconciliation. Proceedings of the 13th International Congress on Medical Informatics, 2010. (pdf)

Sharique Hasan, George T. Duncan, Daniel B. Neill, and Rema Padman. Towards a collaborative filtering approach to medication reconciliation. Proceedings of the American Medical Informatics Association Annual Symposium, 288-292, 2008. (pdf)

Christopher A. Harle, Daniel B. Neill, and Rema Padman. An information visualization approach to classification and assessment of diabetes risk in primary care. Proceedings of the 3rd INFORMS Workshop on Data Mining and Health Informatics, 2008. (pdf)


YOUTH VIOLENCE

Brad J. Bushman, Katherine Newman, Sandra L. Calvert, Geraldine Downey, Mark Dredze, Michael Gottfredson, Nina G. Jablonski, Ann S. Masten, Calvin Morrill, Daniel B. Neill, Daniel Romer, and Daniel W. Webster. Youth violence: what we know and what we need to know. American Psychologist 71(1): 17-39, 2016. (pdf) (APA press release)


GAME THEORY


Daniel B. Neill. Cascade effects in heterogeneous populations. Rationality and Society 17(2): 191-241, 2005. (pdf)

Daniel B. Neill. Evolutionary stability for large populations. Journal of Theoretical Biology 227(3): 397-401, 2004. (pdf)

Daniel B. Neill. Evolutionary dynamics with large aggregate shocks. Dept. of Computer Science, Technical Report CMU-CS-03-197, 2003. (pdf)

Daniel B. Neill. Cooperation and coordination in the Turn-Taking Dilemma. Proceedings of the Ninth Conference on Theoretical Aspects of Rationality and Knowledge: 231-244, 2003. (pdf)

Daniel B. Neill. Optimality under noise: higher memory strategies for the Alternating Prisoner's Dilemma. Journal of Theoretical Biology 211(2): 159-180, 2001. (pdf)


NATURAL LANGUAGE PROCESSING

Paul Hsiung, Andrew Moore, Daniel Neill, and Jeff Schneider. Alias detection in link data sets. Proceedings of the First International Conference on Intelligence Analysis, 2005. (pdf)

Daniel B. Neill. Fully automatic word sense induction by semantic clustering. Cambridge University, master's thesis, M.Phil. in Computer Speech, 2002. (pdf)
 

Research Interests


Machine learning, data mining, event detection, pattern detection, disease surveillance, crime prediction, urban analytics

Education


Ph.D., Computer Science, Carnegie Mellon University, 2006