Northeastern AIM Seminar Talks Archive 2012-2022

2020-2023

Note unusual day, time, and location:
Date: 3-4 pm, Thursday, March 30, 2023 in 105 Shillman and by Zoom (Hybrid)
     Speaker:   Stuart Brorson (Northeastern University)
     Title: Anomaly Detection using Linear Algebra
Abstract:  Detecting anomalous events in time series is an important new application for computers. For example, if a problematic squeak or rumble emitted by an industrial motor could be caught early and the machine repaired, potentially millions of dollars in repair costs could be avoided. I will outline a simple anomaly detection algorithm which uses the Fourier Transform and methods drawn from Linear Algebra. I will demonstrate the algorithm running on a Beaglebone single-board computer and some inexpensive electronics. This talk will be accessible to undergrads and anybody interested in applications of applied math.
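The abstract does not spell out the algorithm, but a minimal sketch in the spirit it describes (Fourier-transform features compared with a linear-algebraic distance test; the window length, signal frequencies, and threshold rule below are all illustrative assumptions, not details from the talk) might look like:

```python
import numpy as np

def spectral_features(window):
    """Feature vector: magnitude spectrum of one time-series window (DC bin dropped)."""
    return np.abs(np.fft.rfft(window))[1:]

def fit_baseline(windows):
    """Learn a mean spectrum and a distance threshold from normal-operation windows."""
    feats = np.array([spectral_features(w) for w in windows])
    mean = feats.mean(axis=0)
    dists = np.linalg.norm(feats - mean, axis=1)  # spread of normal windows
    return mean, dists.mean() + 5.0 * dists.std()

def is_anomalous(window, mean, threshold):
    """Flag a window whose spectrum sits far from the learned baseline."""
    return np.linalg.norm(spectral_features(window) - mean) > threshold

# Normal operation: a 50 Hz hum plus noise; the anomaly adds a 180 Hz "squeak".
rng = np.random.default_rng(0)
t = np.arange(256) / 1024.0
normal = [np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(256)
          for _ in range(40)]
mean, thr = fit_baseline(normal)
fresh = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(256)
squeak = fresh + 0.8 * np.sin(2 * np.pi * 180 * t)
```

Here the squeaky window is flagged while a fresh normal window passes; a real deployment on the Beaglebone would tune the window length, features, and threshold to the motor in question.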
 
Date: April 19, 2022 by Zoom
     Speaker:   Calina Copos (Biology & Math, NU)
     Title: Network dynamics in cells: building connections across scales
Abstract:  Processes at the microscale level can give rise to macroscopic phenomena. For example, the action of molecular-scale motors can move whole cells across a surface. There are a few sound approaches to studying the rise of higher-order organization computationally, including coarse-grained molecular dynamics or Monte Carlo methods, heterogeneous multi-scale stochastic differential equations, or averaged partial differential equations when fluctuations are not a dominant process. In this work, we focus on the dynamics of the formation of the cellular structure as observed at the leading edge of motile eukaryotic cells. We construct a minimal agent-based model for the microscale dynamics, and a deterministic partial differential equation (PDE) model for the macroscopic network growth. We present designed metrics and machine learning approaches that connect phenomenological parameters in the reaction-diffusion system to the biochemical molecular rates typically measured experimentally.
 
Note unusual day and time:
Date: 4:45pm-5:45pm April 13, 2022 in West Village-G (WVG) 102 and by Zoom (Hybrid)
     Speaker:   Monika Pichler (Staff Data Scientist, Keystone Strategy)
     Title: Fairness and Bias in Machine Learning
Abstract:  Machine learning and artificial intelligence capabilities are evolving at a rapid pace and their applications span across many different disciplines, often involving sensitive data or decision making. Meanwhile, considerations of transparency and fairness of such systems are still a nascent field of study. This talk aims to highlight the importance of addressing fairness and bias in ML systems by discussing how bias can be introduced and presenting a case study of an unfair system, and to show some exciting recent research directions in this young field. We will also touch upon initiatives at Keystone Strategy to raise awareness of these issues and ensure they are front and center when designing ML systems for our clients.
Short bio: Monika Pichler received her PhD in mathematics and a certificate in Biotechnology at Northeastern University in 2019. She worked as a data scientist in the biotech field for 2.5 years and is now a staff data scientist at Keystone Strategy, a technology and economics consulting firm.
 
Date: April 12, 2022 by Zoom
     Speaker:   David Rosen (ECE & Math, NU)
     Title: Certifiably Correct Machine Perception
Abstract:  Many fundamental machine perception tasks require the solution of a high-dimensional nonconvex estimation problem; this class includes (for example) the fundamental problems of simultaneous localization and mapping (in robotics), 3D reconstruction (in computer vision), and sensor network localization (in distributed sensing), among others. Such problems are known to be computationally hard in general, with many local minima that can entrap the smooth local optimization methods commonly applied to solve them in practice. The result is that standard machine perception algorithms (based upon local optimization) can be surprisingly brittle, often returning egregiously wrong answers even when the problem to which they are applied is well-posed.
In this talk, we describe a novel class of certifiably correct estimation algorithms that are provably capable of efficiently recovering globally optimal solutions of generally-intractable machine perception problems in many practical settings. In brief, our approach directly tackles the problem of nonconvexity by employing convex relaxations whose minimizers provide exact, globally optimal solutions to the original estimation problem under moderate measurement noise. We illustrate the design of this class of methods using the fundamental problem of pose-graph optimization as a running example, culminating in the presentation of SE-Sync, the first practical method provably capable of recovering correct (globally optimal) solutions of the robotic mapping problem. We conclude with a discussion of open questions and future research directions.
 
Date: March 1, 2022 by Zoom
     Speaker:   Paul Hand (Math & Khoury, NU)
     Title: Signal Recovery with Generative Priors
Abstract:  Recovering images from very few measurements is an important task in imaging problems. Doing so requires assuming a model of what makes some images natural. Such a model is called an image prior. Classical priors such as sparsity have led to the speedup of Magnetic Resonance Imaging in certain cases. With the recent developments in machine learning, neural networks have been shown to provide efficient and effective priors for inverse problems arising in imaging. In this talk, we will discuss the use of neural network generative models for inverse problems in imaging. We will present a rigorous recovery guarantee at optimal sample complexity for compressed sensing and other inverse problems under a suitable random model. We will see that generative models enable an efficient algorithm for phase retrieval from generic measurements with optimal sample complexity. In contrast, no efficient algorithm is known for this problem in the case of sparsity priors. We will discuss strengths, weaknesses, and future opportunities of neural networks and generative models as image priors. These works are in collaboration with Vladislav Voroninski, Reinhard Heckel, Ali Ahmed, Wen Huang, Oscar Leong, Jorio Cocola, Muhammad Asim, and Max Daniels.
 
Date: February 15, 2022 by Zoom
     Speaker:   Jose Perea (Math, NU)
     Title: The underlying topology of data
Abstract:  Topology is the branch of mathematics concerned with shapes and their spatial properties. In this talk I’ll show how several ideas from classic algebraic topology — like cohomology, classifying spaces and vector bundles — can be used in machine learning tasks such as dimensionality reduction, time series analysis and data alignment.
 

2019

Date: October 8, 2019
     Speaker:   Jonathan Ullman (Khoury, NU)
     Title: Preventing Overfitting in Adaptive Data Analysis
Abstract:  How can we use a random sample to draw meaningful conclusions about populations, without overfitting to the sample? This problem remains vexing despite decades of study by statisticians. One cause of overfitting is so-called adaptive data analysis—the common practice where datasets are re-used for multiple analyses, and the outcome of one analysis can influence the choice of future analyses. Adaptivity can arise in a range of ways, including data shared across research groups, multi-stage statistical methods, and data science competitions with shared validation data. Unfortunately, adaptivity invalidates traditional methods for preventing overfitting because it breaks a fundamental assumption about the independence of the data and the method.
In this talk I will introduce a relatively recent approach to understanding and preventing false discovery in adaptive data analysis based on the concept of algorithmic stability. Specifically, I will introduce and motivate the problem of adaptive data analysis, describe a model for studying this problem, and demonstrate how overfitting can occur and, to some extent, be mitigated in this model.
 
Time & Date: 11:45 am, April 9, 2019 (Note unusual time!)
     Speaker:   Gabor Lippner (Math, NU)
     Title: Quantum state transfer – using magnetic fields
Abstract:  Lossless transmission of quantum information through a network of particles is an important task in quantum computing. Mathematically this amounts to studying solutions of the discrete Schrodinger equation d/dt phi = i H phi, where H is typically the adjacency or Laplace matrix of the graph. This in turn leads to questions about subtle number-theoretic behavior of the eigenvalues of H. It has proven to be difficult to find graphs which support such information transfer. I will talk about recent progress in understanding what happens when one is allowed to apply magnetic fields (that is, adding a diagonal matrix to H) to the system of particles.
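The discrete Schrodinger equation above has the exact solution phi(t) = exp(iHt) phi(0). As a concrete numerical check (a sketch, not material from the talk), the path on two vertices with H its adjacency matrix is the textbook example of perfect state transfer:

```python
import numpy as np

def evolve(H, phi0, t):
    """Exact solution of d/dt phi = i H phi via the spectral theorem:
    phi(t) = V diag(exp(i lam t)) V^T phi(0), where H = V diag(lam) V^T."""
    lam, V = np.linalg.eigh(H)
    return V @ (np.exp(1j * lam * t) * (V.conj().T @ phi0))

# Adjacency matrix of the path on two vertices: it admits perfect state
# transfer between its endpoints at time t = pi/2.
H = np.array([[0.0, 1.0], [1.0, 0.0]])
phi0 = np.array([1.0, 0.0])        # excitation starts on vertex 0
phi = evolve(H, phi0, np.pi / 2)   # |phi[1]| = 1: perfect transfer
# A magnetic field, as in the talk, would add a diagonal matrix to H.
```

For larger graphs, whether |phi(t)| ever concentrates fully on the target vertex is exactly the number-theoretic eigenvalue question the abstract mentions.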
 
Time & Date: 11:45 am, April 2, 2019 (Note unusual time!)
     Speaker:   Mario Sznaier (EECE, NU)
     Title: Easy, hard or convex?: the role of sparsity and structure in systems theory
Abstract:  Arguably, one of the hardest challenges faced now by the systems community stems from the exponential explosion in the availability of data, fueled by recent advances in sensing and actuation capabilities. Simply stated, classical techniques are ill-equipped to handle very large volumes of (heterogeneous) data, due to poor scaling properties, and to impose the structural constraints required to implement ubiquitous sensing and control. For example, the powerful Linear Matrix Inequality framework developed in the past 20 years and the associated semidefinite-program-based methods have proven very successful in providing global solutions to many control and identification problems. However, in many cases these methods break down when considering problems involving just a few hundred data points. On the other hand, several in-principle non-convex problems (e.g., identification and robust control of classes of switched systems) can be efficiently solved in cases involving large amounts of data. Thus the traditional convex/non-convex dichotomy may fail to completely capture the intrinsic difficulty of some problems.
The goal of this talk is to explore how this “curse of dimensionality” can be potentially overcome by exploiting the twin “blessings” of self-similarity (high degree of spatio-temporal correlation in the data) and inherent underlying sparsity, and to answer the question of “what is Big Data in systems theory?” While these ideas have already been recently used in machine learning (for instance in the context of dimensionality reduction and variable selection), they have hitherto not been fully exploited in systems theory. By appealing to a deep connection to semi-algebraic optimization, rank minimization and matrix completion we will show that, in the context of systems theory, the limiting factor is given by the “memory” of the system rather than the size of the data itself, and discuss the implications of this fact. These concepts will be illustrated by examining examples of “easy” and “hard” problems, including identification and control of hybrid systems and (in)validation of switched models. We will conclude the talk by exploring the connection between hybrid systems identification, information extraction, and machine learning, and point out new research directions in systems theory and in machine learning motivated by these problems.
 
Date: March 12, 2019
     Speaker:   Paul Hand (Math & Khoury, NU)
     Title: Deep Decoder: Concise Image Representations from Untrained Networks
Abstract:  Deep neural networks have become highly effective tools for compression and image recovery tasks. This success can be attributed in part to their ability to represent and generate natural images well. Contrary to classical tools such as wavelets, image-generating deep neural networks have a large number of parameters and need to be trained on large datasets. We will discuss an untrained simple image model, called the deep decoder, which is a deep neural network that can generate natural images from very few weight parameters. The deep decoder has a simple architecture and fewer weight parameters than the output dimensionality. This underparameterization enables the deep decoder to compress images into a concise set of network weights. Further, underparameterization provides a barrier to overfitting, allowing the deep decoder to have state-of-the-art performance for denoising. The deep decoder’s simplicity makes the network amenable to theoretical analysis, and it sheds light on the aspects of neural networks that enable them to form effective signal representations.
 
Date: February 26, 2019
     Speaker:   Rose Yu (Khoury, NU)
     Title: Fast and Interpretable Tensor Methods for Spatiotemporal Analysis
Abstract:  Multivariate spatiotemporal data is ubiquitous in science and engineering, from sports analytics to neuroscience. Such data can be naturally represented as a multiway tensor. Tensor latent factor models provide a powerful tool for reducing the dimensionality and discovering the higher-order latent structures from data. However, existing tensor models are often slow or fail to yield latent factors that are easy to interpret by domain experts. In this talk, I will demonstrate advances in tensor methods to generate interpretable latent factors for high-dimensional spatiotemporal data. In particular, I will discuss: (1) a multiresolution tensor learning algorithm, that can leverage the multiscale property of high-resolution spatial data, to speed up training and learn interpretable patterns; (2) a tensor latent feature learning algorithm that can learn binary representations of data that are both memory efficient and easy to interpret. We provide theoretical guarantees for our optimization algorithms and demonstrate their applications to real-world data from basketball plays and neuroscience.
 

2018

Date: November 27, 2018
     Speaker:   Sri Srinivas (Physics, NU)
     Title: Data analytics for neurodynamics and quantitative neuroimaging
Abstract:  Recent work in my lab on neurotechnology and MRI would benefit from collaborations with and insights from applied mathematicians. We are exploring the eye-brain connection using visual evoked potentials. We have developed a new portable system combining a smart phone headset with a scalp sensor that measures potentials and electric fields on the scalp. This system is now in clinical tests in patients with macular degeneration and amblyopia, as well as providing new insights into the neuro-visual pathways and processing in the human brain. Signal postprocessing analysis requires extensive de-noising approaches using wavelets and machine-learning algorithms (MLA). The response of the eye-brain system to transient checkerboard pattern reversal stimuli under photopic and scotopic conditions is providing some very interesting new results. While similar results and techniques are used for clinical diagnosis of a variety of neurological and ophthalmic disorders there is very little understanding of the measured response. A systematic quantitative characterization of the observed signals and relation to visual and brain processes could lead to improved diagnosis for eye and brain health. We have developed a new breakthrough method of Quantitative MRI called QUTE-CE MRI. We are currently running the first-in-human clinical trial at MGH and many more are planned. The method leads to angiograms of unprecedented clarity and definition. Furthermore it is quantitative, leading to the first absolute human cerebral blood volume maps, which can be correlated with neurological conditions. Mathematical tools that are useful here are network morphometry, classification and segmentation algorithms, and lesion identification using MLA.
 
Date: October 30, 2018
     Speaker:   Mike Malioutov (Math Dept)
     Title: Statistician and Mathematician. Ronald Fisher and Andrey Kolmogorov: a distant strained relationship
Abstract:  My talk aims to show how the two disciplines of mathematics and statistics can enrich each other, through the example of the distant relationship between two giants, R. Fisher and A. Kolmogorov. Both made revolutionary progress in the theory and applications of their disciplines.
 
Date: October 16, 2018
     Speaker:   Huy Nguyen (CCIS, NU)
     Title: Fast and parallel algorithms for submodular maximization
Abstract:  Submodular functions naturally arise in a variety of contexts from economics to data mining and machine learning. Many optimization problems in these areas can be modeled as optimizing a submodular function subject to constraints. In this talk, we will survey some recent progress in algorithms for these problems in a variety of settings.
 
Date: 11:15 am -12:15 pm, October 9, 2018 (Note slightly delayed time)
     Speaker:   Byron Wallace (CCIS, NU)
     Title: Training Neural NLP Models in Minimally Supervised Settings
Abstract:  Modern neural models have achieved remarkably strong results in natural language processing (NLP) in recent years, achieving state-of-the-art performance across a range of domains and tasks. However, such models tend to be data-hungry, requiring large annotated training corpora to work well. This requirement impedes their use in specialized domains wherein direct supervision is expensive to collect, and hence sparse. I will discuss strategies for training such models with minimal labeled data. In particular these include strategies for active learning (AL) and approaches that exploit domain knowledge and other sources of indirect supervision. Finally, I will discuss approaches to efficient model transfer, including learning disentangled representations of texts.
 
Unusual Time and Date: 4:30-5:30 pm, September 27, 2018 (Joint with the Colloquium)
     Speaker:   Peter Bubenik (Dept of Math, University of Florida)
     Title: Mathematical aspects of Topological Data Analysis
Abstract:  Topological Data Analysis (TDA) is a new approach to analyzing complicated geometric and/or high dimensional data. The central mathematical object of TDA is the persistence module, which can be understood using the language of a number of different branches of mathematics. It turns out that taking an applied viewpoint on this algebraic object leads us to new mathematics. I will give an introduction to TDA focusing on its mathematical foundations, describing the pipeline of turning input data into persistence modules and then mapping to a Hilbert space, allowing one to use tools from statistics and machine learning. I will end with some generalizations and open problems.
 
Date: September 25, 2018
     Speaker:   Spencer Smith (Department of Physics, Mt Holyoke College)
     Title: Topological Entropy Three Ways: new efficient algorithms for measuring fluid flow complexity
Abstract:  Mixing in non-turbulent 2-dimensional fluid flows is due to the repeated stretching and folding of material lines. The exponential rate of increase in the length of these curves provides a fundamental measure of flow complexity. This presents an interesting challenge in the case where our knowledge of the fluid flow is through sparse data: Given a set of trajectories and an initial curve, find the minimum length curve that is compatible with the motion of these points. A very elegant algorithm by Moussafir, and elaborated on by Thiffeault, solves this by encoding the trajectories as braids, loops as algebraic coordinates (Dynnikov), and specifying the action of braids on loops. I will present two new algorithms which solve this same problem more efficiently. Both use ideas from computational geometry to maintain triangulations of the points as they move, while encoding curves as edge weights. These results also pave the way toward rapid detection of coherent structures in flows and provide a foothold to higher-dimensional versions of this problem.
 
Date: April 24, 2018
     Speaker:   Heikki Haario (Lappeenranta University of Technology, Department of Mathematics and Physics)
     Title: Statistical invariance in chaos and random patterns
Abstract:  We present a recent algorithm for parameter estimation of chaotic dynamical systems. The idea is to consider simulated state vectors (or real measurements) as samples from the underlying attractor in the state space, and create a statistical feature vector to characterize its variability. With slight modifications, the approach can be applied to SDE systems, as well as to identify model parameters of reaction-diffusion systems by ensembles of Turing patterns, such as those produced by, e.g., the Fitzhugh-Nagumo model, a classical model of excitable media.
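The abstract leaves the feature vector unspecified; a toy version of the idea (with illustrative choices throughout: the logistic map standing in for the chaotic system, and a state-space histogram standing in for the statistical feature vector) can be sketched as:

```python
import numpy as np

def orbit(r, x0, n=20000, burn=1000):
    """Sample the attractor of the logistic map x -> r x (1 - x)."""
    x, xs = x0, []
    for i in range(burn + n):
        x = r * x * (1 - x)
        if i >= burn:
            xs.append(x)
    return np.array(xs)

def feature(samples, bins=50):
    """Statistical feature vector: the empirical state-space histogram."""
    h, _ = np.histogram(samples, bins=bins, range=(0.0, 1.0), density=True)
    return h

# Orbits from different initial conditions but the same parameter give
# nearly identical features, while a different parameter sits far away;
# the feature distance can therefore drive parameter estimation even
# though individual chaotic trajectories are unpredictable.
d_same = np.linalg.norm(feature(orbit(3.9, 0.3)) - feature(orbit(3.9, 0.4)))
d_diff = np.linalg.norm(feature(orbit(3.9, 0.3)) - feature(orbit(3.7, 0.3)))
```

Minimizing such a feature distance over the parameter is the kind of likelihood surrogate the talk's approach builds, with more refined statistics in place of a raw histogram.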
 
Date: April 10, 2018
     Speaker:   Daniel Wichs (CCIS, Northeastern University)
     Title: Encrypted Computation
Abstract:  This talk will explore recent progress in cryptosystems that enable computation over encrypted data. We will discuss fully homomorphic encryption and signatures, as well as extensions to the multi-party setting.
 
Date: March 20, 2018
     Speaker:   Emanuele Viola (CCIS, Northeastern University)
     Title: Interleaved group products
Abstract:  Let G be the special linear group SL(2,q). We show that if (a1,a2) and (b1,b2) are sampled uniformly from large subsets A and B of G^2 then their interleaved product a1 b1 a2 b2 is nearly uniform over G. This extends a result of Gowers (2008) which corresponds to the independent case where A and B are product sets. We obtain a number of other results. For example, we show that if X is a probability distribution on G^m such that any two coordinates are uniform in G^2, then a pointwise product of s independent copies of X is nearly uniform in G^m, where s depends on m only. Similar statements can be made for other groups as well. (Results obtained with Timothy Gowers.) These results have applications in computer science, which is the area where they were first sought by Miles and Viola (2013).
 
Date: February 27, 2018
     Speaker:   Ehsan Elhamifar (CCIS and ECE, Northeastern University)
     Title: Subset Selection and Summarization in Sequential Data
Abstract:  The increasing amount of sequential data, including video, speech and text, across different areas of science requires robust and scalable techniques to reduce data redundancy. In this talk, I will discuss the problem of subset selection and summarization for sequential data and focus on the development of efficient optimization algorithms to tackle the problem. While best subset selection is, in general, an NP-hard problem, I will present efficient and robust algorithms based on convex and submodular optimization, that come with theoretical performance guarantees and can be applied to large-scale data and online observations. These methods, in contrast to classical approaches, can deal with general nonlinear data models, arbitrary similarity measures and data nuisances. I will discuss successful applications of these methods to summarization of instructional videos and to learning the grammar of complex tasks using the vast amount of data on the Internet.
 
Date: February 20, 2018
     Speaker:   Nicolai Panikov (Biotechnology Program, Northeastern University)
     Title: Mathematical Modeling of Complex Microbial Dynamics in Bioreactors and in Nature
Abstract:  The talk deals with using non-linear ODEs to simulate complex dynamic behavior of microbial populations, such as growth in batch and continuous culture, well-mixed homogeneous and spatially organized growth (biofilms, colonies), steady-state and transient processes, switching from one nutrient source to another, adaptation to starvation and other stresses, super-fast and near-zero growth, etc. My review will start from simple unstructured models with 1-2 variables (e.g. cell mass and limiting nutrient concentrations) and then proceed step-by-step to more structured models, including genome-scale simulations operating with up to a thousand variables. As expected, the best results were obtained with “fundamentally correct” models of intermediate complexity. I will discuss a strategy for finding the desired level of model complexity, depending on the goals and objectives of a particular study.
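As a reference point for the simplest end of that spectrum, the classical unstructured Monod model couples one biomass variable to one limiting-nutrient variable; a minimal batch-growth simulation (the parameter values here are arbitrary illustrations, not from the talk) is:

```python
def monod_batch(x0, s0, mu_max=0.5, K=0.2, Y=0.4, dt=1e-3, T=40.0):
    """Euler integration of the unstructured Monod model for batch growth:
    dx/dt = mu(s) x,  ds/dt = -mu(s) x / Y,  mu(s) = mu_max * s / (K + s),
    where x is cell mass, s the limiting nutrient, Y the yield coefficient."""
    x, s = x0, s0
    for _ in range(int(T / dt)):
        mu = mu_max * s / (K + s)
        x, s = x + dt * mu * x, max(s - dt * mu * x / Y, 0.0)
    return x, s

# Batch culture: biomass grows until the nutrient is exhausted. The
# combination x + Y*s is conserved, so the final biomass is x0 + Y*s0.
x, s = monod_batch(x0=0.01, s0=1.0)
```

Structured and genome-scale models replace the single rate law mu(s) with many internal state variables, but the same ODE-integration skeleton applies.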
 
Date: February 13, 2018
     Speaker:   Elina Robeva (Math Dept, MIT)
     Title: Maximum Likelihood Density Estimation under Total Positivity
Abstract:  Nonparametric density estimation is a challenging problem in theoretical statistics: in general the maximum likelihood estimate (MLE) does not even exist! Introducing shape constraints allows a path forward. This talk offers an invitation to non-parametric density estimation under total positivity (i.e. log-supermodularity) and log-concavity. Totally positive random variables are ubiquitous in real world data and possess appealing mathematical properties. Given i.i.d. samples from such a distribution, we prove that the maximum likelihood estimator under these shape constraints exists with probability one. We characterize the domain of the MLE and show that it is in general larger than the convex hull of the observations. If the observations are 2-dimensional or binary, we show that the logarithm of the MLE is a tent function (i.e. a piecewise linear function) with “poles” at the observations, and we show that a certain convex program can find it. In the general case the MLE is more complicated. We give necessary and sufficient conditions for a tent function to be concave and supermodular, which characterizes all the possible candidates for the MLE in the general case.
 

2017

Date: November 28, 2017
     Speaker:   Anand Osa (Courant Institute, NYU)
     Title: Coarse-grained models for interacting flapping swimmers
Abstract:  I will present the results of a theoretical investigation into the dynamics of interacting flapping swimmers. Our study is motivated by ongoing experiments in the Applied Math Lab at the Courant Institute, in which freely-translating, heaving hydrofoils interact hydrodynamically to choose their relative positions and velocities. We develop a discrete dynamical system in which flapping swimmers shed point vortices during each flapping cycle, which in turn exert forces on the swimmers. We present a framework for finding exact solutions to the evolution equations and for assessing their stability, giving physical insight into the preference for certain observed “schooling states”. The model may be extended to arrays of flapping swimmers, and to configurations in which the swimmers’ flapping frequencies are incommensurate. Generally, our results indicate how hydrodynamics may mediate schooling and flocking behavior in biological contexts.
 
Date: 4:00-5:00 pm, November 14, 2017 (Note unusual time!)
Place: 315 Behrakis (Note unusual place!)
     Speaker:   Stuart Brorson (Adjunct, Math Dept, Northeastern University)
     Title: A new calculation of Feigenbaum’s constant alpha
Abstract:  The logistic map is a widely-studied example of a dynamical system which behaves chaotically for certain parameter ranges. It is of particular interest because it displays a “period doubling” transition from order to chaos. In the late 1970s, Mitchell Feigenbaum showed that the period doubling behavior could be characterized by two numbers, delta and alpha. He also showed these numbers were universal in the sense that any map obeying fairly broad conditions would evidence the same values of delta and alpha. Therefore, delta and alpha are mathematical constants on the same footing as pi and e.
In my talk I will discuss where delta and alpha come from, and present a high-precision calculation (over 1000 digits) of alpha using the new computer language Julia. This represents the most digits of alpha ever calculated. I will talk about features in Julia which enable this computation. My talk will be of interest to all, and specifically accessible to undergraduates.
Note:  This talk is jointly sponsored with the NU Math Club.
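For a flavor of where delta comes from: it is the limit of the ratios (r_n - r_(n-1)) / (r_(n+1) - r_n), where r_n is the parameter at which the logistic map f(x) = r x (1 - x) has a superstable cycle of period 2^n. A low-precision sketch in Python (the talk itself works in Julia to over 1000 digits; the bisection brackets below are hand-chosen assumptions):

```python
def g(r, n):
    """f^(2^n)(1/2) - 1/2 for the logistic map f(x) = r*x*(1-x); the
    superstable period-2^n parameter is a root of g(., n)."""
    x = 0.5
    for _ in range(2 ** n):
        x = r * x * (1 - x)
    return x - 0.5

def superstable(n, lo, hi):
    """Bisection for the superstable parameter inside a known bracket."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(lo, n) * g(mid, n) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

r0 = superstable(0, 1.9, 2.1)   # exact value 2
r1 = superstable(1, 3.1, 3.4)   # exact value 1 + sqrt(5)
r2 = superstable(2, 3.44, 3.55)
delta_estimate = (r1 - r0) / (r2 - r1)  # early ratio; the limit is 4.669...
```

Double-precision floats run out of room after only a handful of doublings, which is why a serious calculation of delta and alpha needs the arbitrary-precision arithmetic the talk demonstrates.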
 
Date: November 7, 2017
     Speaker:   Christopher Riedl (CCIS and the D’Amore-McKim School of Business, Northeastern University)
     Title: Optimal Experimental Design in Online Experiments
Abstract:  We illustrate and extend a dynamic method to design and execute Bayesian optimal experiments. The method extends seminal work in statistics and computer science on Bayesian optimal design that has not taken hold due to computational limitations. Our procedure takes as primitives a class of parametric models of strategic behavior, a class of experimental designs, and priors on the behavioral parameters. The method then selects the experimental design that maximizes the information from the experiment and thus increases statistical power. We propose and evaluate two computational improvements to make the application of our approach more tractable. First, we develop a probabilistic bootstrapping procedure to reduce the number of hypothetical experimental outcomes that need to be considered. Second, we apply modern Artificial Intelligence methods to learn a multi-dimensional function to efficiently optimize the space of possible experiments. Together, these two improvements reduce computational complexity by several orders of magnitude over the naive approach while simultaneously increasing precision and reducing regret. This allows for a completely automated approach to designing experiments using only the space of possible experiments as input, which is critically important for the large scale experimentation performed by internet companies that run thousands of experiments daily.
 
Date: November 2, 2017.
     Speaker:   Suncica Canic (Department of Mathematics, University of Houston)
     Special Talk: 12:00-1:30 pm, Interaction between fluids and structures motivated by real-life problems: micro-swimming soft robots, vascular stents, and heart valves.
      Colloquium: 4:30-5:30 pm, Capturing nonlinear interaction between fluids and fiber-reinforced composite materials: a unified mathematical framework.
 
Date: 12:00-1:00 pm, October 17, 2017 (Note unusual time!)
     Speaker:   Michelle Borkin (CCIS, Northeastern University)
     Title: Data Visualization Across Disciplines
Abstract:  What can help enable both the treatment of heart disease and the discovery of newborn stars? Visualization. Specifically interdisciplinary data visualization, the sharing and co-development of tools and techniques across domains. In this talk I will share sample results from my own research and experience crossing disciplines and bringing together the knowledge and experts of computational physics, computer science, astrophysics, radiology and medicine. I will present new visualization techniques and tools inspired by this work for the astronomical and medical communities.
 
Date: April 18, 2017
     Speaker:   Michael Allshouse (MIE, Northeastern University)
     Title: Application of Lagrangian coherent structures to oceanic transport
Abstract:  Trajectory based analysis uncovers insight into the transport of fluid systems not apparent from Eulerian data. Structures that dominate transport are revealed by these trajectories and are referred to as Lagrangian coherent structures. These structures have been used to elucidate transport in systems ranging in scale from blood flow to the dynamics of Jupiter’s Great Red Spot. Coherent structures are used here to address problems of near-coastal ocean transport. We first present the theory behind these coherent structures, then apply them to the surface transport of oil from spills and show that the fate of the oil can be dramatically affected by surface wind, which has not been included in previous studies. Another problem we analyze with coherent structures is the subsurface transport of cold, salty water from the abyss to the coastline. Much like waves on the ocean surface, density disturbances propagate through the stratified ocean as internal gravity waves with amplitudes of ~100m. As these internal waves approach the continental slope, they steepen and break, resulting in foreign material being transported with the wave up the slope. We will show how biota and sediments can be trapped in the coherent structures formed by the nonlinear shoaling internal waves.
 
Date: April 11, 2017
     Speaker:   Chun-An (Joe) Chou (MIE, Northeastern University)
     Title: Interpretable Medical Decision Support via Combinatorial Optimization
Abstract:  Accurate medical diagnosis and prediction tools are particularly important in helping one make better decisions on personalized treatment and intervention. Various predictive models and approaches have been developed and have achieved very high accuracy. However, most current tools built with machine learning techniques operate as black boxes and are not preferred, because they fail to provide transparent information, e.g., what-if decision rules. In this talk, we present a combinatorial optimization approach to building accurate and interpretable decision models driven by data, and demonstrate several practical cases in comparison with state-of-the-art machine learning methods.
 
Date: April 4, 2017
     Speaker:   Carina Curto (Penn State)
     Title: What can topology tell us about the neural code?
Abstract:  Cracking the neural code is one of the central challenges of neuroscience. Neural codes allow the brain to represent, process, and store information about the outside world. Unlike other types of codes, they must also reflect relationships between stimuli, such as proximity between locations in an environment. In this talk, I will explain why algebraic topology provides natural tools for understanding the structure and function of neural codes.
 
Date: March 28, 2017
     Speaker:   Mehdi Behroozi (MIE, Northeastern University)
     Title: Asymptotic Analysis in Large-scale Optimization of Logistics and Network Problems: A Geometric Perspective
Abstract:  Most problems in logistics, transportation, and network optimization are solved via linear and integer programming. However, most of these approaches are computationally inefficient for large-scale problems. A geometric perspective can provide computational efficiency and at the same time bring new insights by looking at the input data differently and exploiting intrinsic geographic characteristics of these problems. In this talk we apply this geometric perspective to two practical transportation problems.
In the first problem, we study a fundamental trade-off in determining the net carbon footprint that results from widespread adoption of centralized delivery services. On the one hand, such services are efficient because they aggregate demand together and can plan efficient routes; on the other hand, a household that already undertakes a large amount of driving may have an “economy of scale” of its own because it is likely to be near a grocery store at some point anyway. We model this problem as a Generalized Travelling Salesman Problem (GTSP) and we perform a probabilistic analysis of the asymptotic behavior of the GTSP in order to showcase this trade-off and the efficiency of such services.
In the second problem, we consider a data-driven distributionally robust version of the Euclidean travelling salesman problem in which we compute the worst-case spatial distribution of demand against all distributions whose Wasserstein distance to an observed demand distribution is bounded from above. This constraint allows us to circumvent the common overestimation that arises when other procedures are used, such as fixing the center of mass and the covariance matrix of the distribution. Numerical experiments confirm that our new approach is useful as a decision support tool for dividing a territory into service districts for a fleet of vehicles when limited data is available.
 
Date: March 21, 2017
     Speaker:   Benjamin Allen (Emmanuel College)
     Title: Evolutionary dynamics on any population structure
Abstract:  Evolution occurs in populations of reproducing individuals. The structure of a population can affect which traits evolve. Understanding evolutionary game dynamics in structured populations is difficult. Mathematical results have recently emerged for special structures where all individuals have the same number of neighbors. But the general case, where the number of neighbors can vary, has remained open. For arbitrary selection intensity, the problem is in a computational complexity class which suggests there is no efficient algorithm. Whether there exists a simple solution for weak selection was unanswered. Here we provide a solution for weak selection that applies to any graph or social network. Our method relies on calculating the coalescence times of random walks. We evaluate large numbers of diverse and heterogeneous population structures for their propensity to favor cooperation. We study how small changes in population structure—graph surgery—affect evolutionary outcomes. We find that cooperation flourishes most in societies that are based on strong pairwise ties.
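The coalescence-time calculation the abstract mentions can be sketched directly. The recurrence below is a standard coalescing-random-walk system (two walkers, one of the two moving per step), consistent with the approach described; the complete graph on three vertices is a toy example, not data from the paper:

```python
import itertools
import numpy as np

def coalescence_times(A):
    """Pairwise coalescence times tau[i, j] of random walks on an
    undirected graph with adjacency matrix A, from the recurrence
        tau[i, i] = 0,
        tau[i, j] = 1 + (1/2) sum_k (P[i,k] tau[k,j] + P[j,k] tau[i,k]),
    where P is the single-step transition matrix of the walk."""
    n = len(A)
    P = A / A.sum(axis=1, keepdims=True)
    idx = {pair: m for m, pair in enumerate(itertools.product(range(n), repeat=2))}
    M, b = np.eye(n * n), np.zeros(n * n)
    for (i, j), m in idx.items():
        if i == j:
            continue                      # row stays tau[i, i] = 0
        b[m] = 1.0
        for k in range(n):
            M[m, idx[(k, j)]] -= 0.5 * P[i, k]
            M[m, idx[(i, k)]] -= 0.5 * P[j, k]
    return np.linalg.solve(M, b).reshape(n, n)

# toy example: the complete graph on 3 vertices
tau = coalescence_times(np.ones((3, 3)) - np.eye(3))
```

By symmetry the complete graph gives tau = 2 for every distinct pair, which the solver reproduces.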
 
Date: February 7, 2017 (CANCELLED)
     Speaker:   Michael Malioutov (Northeastern University)
     Title: SCOT approximation and asymptotic inference
Abstract:  Approximation of stationary strongly mixing processes by m-Markov models with sparse memory structure (SCOT), and the Le Cam-Hajek-Ibragimov-Khasminsky locally minimax theory of statistical inference for them, is outlined. In our previous papers we proved SCOT equivalence to a 1-MC with state space-alphabet consisting of the SCOT contexts. For a fixed alphabet size and growing sample size, Local Asymptotic Normality is proved and applied to establish asymptotically optimal inference. We outline what obstacles arise for a large SCOT alphabet size and a not necessarily vast sample size.
 
Date: 12:00-1:00 pm, January 17, 2017 (Note unusual time!)
     Speaker:   Dmitri Krioukov (Northeastern University)
     Title: Clustering Implies Geometry in Networks
Abstract:  Two common features of many large real networks are that they are sparse and that they have strong clustering, i.e., a large number of triangles homogeneously distributed across all nodes. In many growing real networks for which historical data is available, the average degree and clustering are roughly independent of the growing network size. Recently, (soft) random geometric graphs, also known as latent-space network models, with hyperbolic and de Sitter latent geometries have been used successfully to model these features of real networks, to predict missing and future links in them, and to study their navigability, with applications ranging from designing optimal routing in the Internet to identifying the information-transmission skeleton in the human brain. Yet it remains unclear whether latent-space models are indeed adequate models of real networks, as random graphs in these models may have structural properties that real networks do not have, or vice versa.
We show that the canonical maximum-entropy ensemble of random graphs, in which the expected numbers of edges and triangles at every node are fixed to constants, is approximately an ensemble of soft random geometric graphs on the real line. The approximation is exact in the limit of standard random geometric graphs with a sharp connectivity threshold and strongest clustering. This result implies that a large number of triangles homogeneously distributed across all vertices is not only a necessary but also a sufficient condition for the presence of a latent/effective metric space in large sparse networks. Strong clustering, ubiquitously observed in real networks, is thus a reflection of their latent geometry.
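The link between geometry and triangles can be illustrated numerically: at the same edge density, a geometric graph has far more triangles per node than an Erdős-Rényi graph. A minimal sketch (points on a circle rather than the talk's real line or hyperbolic plane; all sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 10                     # nodes and target average degree

def triangles_per_node(A):
    """Number of triangles through each node: diag(A^3) / 2."""
    return np.diag(A @ A @ A) / 2

# Erdos-Renyi graph with expected degree k (no geometry)
U = (rng.random((n, n)) < k / (n - 1)).astype(float)
A_er = np.triu(U, 1) + np.triu(U, 1).T

# geometric graph on a circle: link points within a small angular
# distance, with the threshold chosen so the expected degree is also k
theta = rng.uniform(0, 2 * np.pi, n)
d = np.abs(theta[:, None] - theta[None, :])
d = np.minimum(d, 2 * np.pi - d)
A_geo = (d < np.pi * k / n).astype(float)
np.fill_diagonal(A_geo, 0)

t_er = triangles_per_node(A_er).mean()
t_geo = triangles_per_node(A_geo).mean()
```

With identical density, the latent metric space forces neighbors of a node to be near each other, hence the much larger triangle count.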
 

2016

Date: December 6, 2016
     Speaker:   Jose Angel Martinez-Lorenzo (ECE, Northeastern University)
     Title: Next Generation Multi-Coded Compressive Systems for High-Capacity Sensing and Imaging Applications
Abstract:  Millimeter-wave sensing and imaging systems are used ubiquitously for a wide range of applications, such as atmospheric sounding of the earth to forecast the weather, non-destructive testing to assess the condition of civil infrastructure, deep space observation to explore the composition of the galaxies, security monitoring to detect potential threats in airport checkpoints, and biological imaging of superficial tissues for wound diagnosis and healing. These systems typically operate well when the scene does not change rapidly. Unfortunately, this is not the case in emerging societally important applications like swarms of drones in rescue missions, smart self-driving cars on roadways, or cyber-physical systems searching for suicide bombers on the move. These new applications require 4D (space + time) sensing and imaging at high video frame rates, as well as adapting the sensing process based on the evolution of the scene. The primary challenges of implementing a 4D adaptable sensing and imaging system can be summarized as follows: 1) the system must be capable of handling variable dynamics, i.e., objects may be moving with different velocities and be located at different focal ranges; 2) the system must be capable of sampling data with a sufficient signal-to-noise ratio during the limited period of time allowed by the scene dynamics; and 3) the system must sample data at massive rates, on the order of several gigabytes per second, in order to perform fast 4D video reconstruction with a high volumetric frame rate. With these challenges in mind, one of the key features of these 4D sensing and imaging systems will be the ability to extract the maximum amount of information (measured in bits) describing properties of an object located in the imaging domain from the data collected by the sensing system in a given period of time (seconds); this is the information rate (measured in bits per second).
This information rate has an upper bound, which is the maximum amount of information that can be transferred by an electromagnetic wave from one region of space into another (the system’s physical capacity). This maximum information rate can only be approached when the mutual information of successive measurements is minimized. One way to achieve this is to merge traditional 1D temporal modulation with 3D dynamical modulation of the wavefield in space – a novel technique known as 3D spatial wave-field coding or spatial light modulation. 1D temporal codes enable one to reach a 1D information rate close to the upper bound imposed by the 1D physical capacity, also known as the Shannon capacity. Combining 3D volumetric wavefield coding and 1D temporal codes should enable one to reach an information rate close to that imposed by the 4D physical capacity.
This talk will cover the theoretical principles and fundamental limitations of adaptable compressive sensing and imaging systems using 4D coding. This coding can be implemented by novel Multi-Coded Compressive Systems using the following physical structures: spatial light modulators, vortex-meta-lenses, and compressive reflectors. Preliminary results will be presented in four domains: 1) multi-scale computational modeling; 2) system design and optimization; 3) real-time distributed imaging; and 4) hardware design integration and validation.
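Compressive systems like those above recover a scene from far fewer coded measurements than unknowns; the core computational step is sparse recovery. A generic sketch (iterative soft-thresholding with a random Gaussian code standing in for the physical coding structures; all sizes illustrative, not the speaker's system):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, s = 200, 80, 5        # unknowns, coded measurements, nonzeros

# sparse "scene" and a random Gaussian code standing in for the hardware
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true

# ISTA for min_x 0.5*||A x - y||^2 + lam*||x||_1
lam = 0.01
L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
x = np.zeros(n)
for _ in range(3000):
    z = x - A.T @ (A @ x - y) / L    # gradient step on the quadratic
    x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
```

Despite having only 80 measurements for 200 unknowns, the 5-sparse scene is recovered almost exactly.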
 
Date: November 8, 2016
     Speaker:   Zheyang Wu (Worcester Polytechnic Institute)
     Title: Methods for Detecting Weak and Sparse Signals Among Correlated Features
Abstract:  A group of optimal tests, such as the Higher Criticism test (HC), the Berk-Jones test (B-J), $\phi$-divergence tests, etc., have been shown to be asymptotically optimal under the independence assumption. They have also been shown to possess theoretical advantages under dependence. However, due to the difficulty of p-value calculation for these tests under dependence, current applications are based either on de-correlation of the input tests or on empirical testing methods. We demonstrate that de-correlation is not an appropriate strategy; properly incorporated correlation information can help to improve statistical power. Meanwhile, we provide a solution for calculating the p-values under a broad range of correlation structures. Under stronger correlations our method is more accurate than the recently proposed generalized Higher Criticism (GHC) method, which also targets the correlated-data problem. Moreover, our method applies to a wider family of goodness-of-fit (GOF) tests. This family covers the above-mentioned optimal tests, some of which are more powerful than GHC under various correlations.
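The Higher Criticism statistic named above is simple to compute from a vector of p-values. A sketch under independence (the easy case, not the talk's correlated setting), with illustrative sample sizes and signal strengths:

```python
import numpy as np

def higher_criticism(pvals, alpha0=0.5):
    """Donoho-Jin Higher Criticism statistic, maximized over the
    smallest alpha0 fraction of the ordered p-values."""
    p = np.sort(np.asarray(pvals))
    n = len(p)
    i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p))
    return hc[: int(alpha0 * n)].max()

rng = np.random.default_rng(2)
n = 10_000
null = rng.uniform(size=n)                    # global null: all Uniform(0, 1)
mixed = null.copy()
mixed[:50] = rng.uniform(0, 1e-4, size=50)    # a few weak, sparse signals
hc_null, hc_mixed = higher_criticism(null), higher_criticism(mixed)
```

Even though only 0.5% of the p-values carry signal, the statistic separates the two samples clearly.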
 
Date: October 18, 2016
     Speaker:   Jian Zou (Worcester Polytechnic)
     Title: Conquering Big Data in Volatility Inference and Risk Management
Abstract:  The field of high-frequency finance has experienced rapid evolution over the past few decades. One focus is volatility modeling and analysis in the big-data setting, which plays a major role in finance and economics. In this talk, we focus on the statistical inference problem for large volatility matrices using high-frequency financial data, and propose a methodology to tackle this problem under various settings. We illustrate the methodology with high-frequency price data on stocks traded on the New York Stock Exchange in 2013. The theory and numerical results show that our approach performs well, pooling together the strengths of regularization and estimation from a high-frequency finance perspective.
 
Date: September 27, 2016
     Speaker:   Ting Zhang (Boston University)
     Title: Semiparametric Model Building for Regression Models with Time-Varying Parameters
Abstract:  I consider the problem of semiparametric model building for linear regression models with potentially time-varying coefficients. By allowing the response variable and explanatory variables to be jointly a nonstationary process, the proposed methods are widely applicable to nonstationary and dependent observations, for example time-varying autoregressive processes with heteroscedastic errors. We propose a local linear shrinkage method that is capable of achieving variable selection and parameter estimation simultaneously in a computationally efficient manner. Its selection consistency along with the favorable oracle property is established. To guard against a loss of efficiency, an information criterion is further proposed for distinguishing between time-varying and time-invariant components. Numerical examples are presented to illustrate the proposed methods.
 
Date: April 12, 2016
     Speaker:   Edmund Yeh (ECE, Northeastern University)
     Title: Throughput and Delay Scaling in Content-centric Wireless Networks
Abstract:  We study the throughput and delay scaling of wireless networks based on a content-centric network architecture, where users are mainly interested in retrieving content stored in the network, rather than in maintaining source-destination communication. Nodes are assumed to be uniformly distributed in the network area. Each node has a limited-capacity content store, which it uses to cache contents according to a given caching scheme. Requested content follows a general popularity distribution, and users employ multihop communication to retrieve the requested content from the closest cache. We derive the throughput-delay tradeoff of the content-centric wireless network model. It is shown that in contrast to the source-destination-based communication model, the use of caching can simultaneously increase network throughput and decrease delay in the content-centric model. For the Zipf content popularity distribution, we explicitly find the optimal cache allocation to maximize throughput and minimize delay.
Next, we investigate heterogeneous wireless networks where, in addition to wireless nodes, there are a number of base stations uniformly distributed at random in the network area. Here, users retrieve the requested content from the closest wireless caching point, or from the closest base station (assumed to have access to all contents), whichever is nearer. We show that in order for a heterogeneous network to achieve better performance than an ad hoc network in the order sense, the number of base stations needs to be greater than the ratio of the number of nodes to the number of content types. Furthermore, we show that the heterogeneous network does not yield performance advantages in the order sense if the Zipf content popularity distribution exponent exceeds 3/2.
Joint work with Milad Mahdian
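The role of the Zipf popularity distribution above can be illustrated with a quick calculation of the cache hit rate when each cache stores the most popular contents (a simplified stand-in for the paper's optimal allocation; all sizes are illustrative):

```python
import numpy as np

def zipf_hit_rate(n_contents, cache_size, alpha):
    """Fraction of requests served from a cache that stores the
    cache_size most popular contents under Zipf(alpha) popularity."""
    p = np.arange(1, n_contents + 1) ** (-float(alpha))
    p /= p.sum()                  # normalize rank probabilities
    return p[:cache_size].sum()   # mass on the cached (top) ranks

# caching 1% of contents covers more demand as the exponent grows
rates = [zipf_hit_rate(10_000, 100, a) for a in (0.8, 1.2, 1.6)]
```

The heavier the Zipf tail exponent, the more demand a small cache absorbs, which is why the exponent (e.g. the 3/2 threshold above) governs when caching changes the scaling behavior.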
 
Date: March 15, 2016  
     Speaker:   Usama Kadri (MIT)
     Title: Acoustic-gravity waves, theory & applications
Abstract:  The classical water-wave theory ignores the effects of water compressibility, on the grounds that acoustic waves are virtually decoupled from free-surface waves. In the linear theory, this assumption is well justified for many applications, as acoustic propagation modes possess vastly different spatial and/or temporal scales from free-surface waves, because the sound speed in water far exceeds the maximum surface-wave phase speed. In the incompressible wave-equation formulation for waves in a fluid layer with a rigid bottom and a free surface, any prescribed frequency corresponds to a single propagating gravity wave. Accounting for both gravity and compressibility, however, reveals a countable infinity of propagation modes, some of which are non-evanescent compression modes (they do not decay with distance) called acoustic-gravity waves.
Acoustic-gravity waves are generated in response to a sudden change of the water surface, e.g. due to a tsunami or another sudden severe sea state. They travel at near the speed of sound in water (ca. 1500 m/s), which makes them ideal precursors. The study of acoustic-gravity waves is an emerging field that is rapidly gaining popularity in the scientific community, as it finds broad utility in physical oceanography, marine biology, geophysics, and multiphase flows, to name a few. Acoustic-gravity waves are compression-type waves that can be generated by wind-wave and wave-wave interactions, movements of the tectonic lithosphere plates, landslides, and submarine explosions. They may be found in oceans, lakes, rivers, and the atmosphere, as well as in hydraulic systems and flow pipelines.
This talk is an overview of acoustic-gravity waves, confined to three major projects I am currently conducting and looking to study further. In particular, I will briefly discuss applications to the early detection of tsunamis and the transportation of water in the deep ocean, and then focus on the nonlinear triad resonance theory of acoustic-gravity waves, concluding with various implications.
 
Date: 12:00-1:00 pm, January 26, 2016   (Note unusual time)
     Speaker:   Wenjia Jing (Univ of Chicago)
     Title: Stochastic homogenization and random fluctuation theory for partial differential equations
Abstract:  Partial differential equations with oscillatory coefficients arise in many applications, such as the modeling of composite materials and flame propagation. The fine scale variations of the physical media are often unknown and are typically modeled as random. The theory of stochastic homogenization amounts to exploring the self-averaging mechanism of the PDEs, and it leads to mean field approximations of the large scale behavior of the solution. The theory of random fluctuations aims at studying the differences between the solution and its homogenization limit, leading to higher order, and often Gaussian, corrections to the approximation. In this talk, I will present some results on homogenization of the Hamilton-Jacobi equations and on the fluctuation theory for elliptic equations with random potentials.
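The mean-field approximation the abstract mentions can be made concrete in one dimension, where the homogenized coefficient of -(a(x/eps) u')' = f is the harmonic mean of a (a textbook fact, not the speaker's result). A minimal sketch with an illustrative random coefficient:

```python
import numpy as np

rng = np.random.default_rng(3)
eps, N = 1 / 256, 16384
x = (np.arange(N) + 0.5) / N              # cell midpoints on (0, 1)

# piecewise-constant random coefficient oscillating on scale eps
vals = rng.uniform(1.0, 5.0, size=int(1 / eps))
a = vals[np.minimum((x / eps).astype(int), len(vals) - 1)]

# -(a u')' = 1 with u(0) = u(1) = 0:  a u' = C - x, and u(1) = 0 fixes C
h = 1.0 / N
C = np.sum(x / a) / np.sum(1.0 / a)
u = np.cumsum((C - x) / a) * h            # quadrature for u

# the homogenized problem uses the HARMONIC mean of a,
# not the arithmetic mean
a_harm = 1.0 / np.mean(1.0 / a)
u_harm = x * (1 - x) / (2 * a_harm)
err_harm = np.max(np.abs(u - u_harm))
```

The oscillatory solution hugs the harmonic-mean parabola up to an O(eps) corrector, the self-averaging mechanism the talk explores in far greater generality.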
 
Date: 12:00-1:00 pm, January 19, 2016   (Note unusual time)
     Speaker:  Francois Monard (University of Michigan)
     Title: Toward imaging modalities with high(er) resolution
Abstract:  Given an unknown physical quantity to be imaged, the extent to which an imaging approach (or the corresponding inverse problem) best exploits the mathematics and the physics of the problem highly impacts (i) what can be reconstructed of this quantity, and (ii) the resolution available in the resulting images. Improvements in both directions (i) and (ii), of tremendous interest for, e.g., medical applications, can then be achieved by changing the underlying inverse problem into another one displaying better invertibility and conditioning.
This strategy for obtaining improvements will be illustrated in two ways: first by discussing some inverse problems for the Boltzmann transport equation, a model for optical tomography and SPECT. Second, we will review some coupled-physics (or hybrid) medical imaging modalities. Such modalities involve a coupling between a typically poorly resolved soft-tissue imaging modality (e.g., EIT, elastography) and a highly resolved one (e.g., ultrasound, MRI), in order to derive imaging models displaying both high contrast and high resolution. In this context, the mathematical analysis of some models concerned with the reconstruction of conductivity and elasticity tensors has shown manifest improvements in both resolution and the capability of accessing previously unavailable anisotropic features.
 
Date: 12:00-1:00 pm, January 12, 2016   (Note unusual time)
     Speaker:  Emanuel Lazar (Univ Penn)
     Title: Topological Data Analysis: Structure in Spatial Point Sets
Abstract:  Many physical systems can be abstracted as large sets of point-like particles. Understanding how such particles are arranged is thus a very natural problem, though describing this “structure” in an insightful yet tractable manner can be difficult. We consider a configuration space of local arrangements of neighboring points, and consider a stratification of this space via Voronoi topology. Theoretical results help explain limitations of purely metric solutions to this problem, and computational examples illustrate the unique effectiveness of topological ones. Applications to computational materials science are considered.
 

2015

Date: November 17, 2015
     Speaker:  Francois Monard (Mathematics, University of Michigan)
     Title: X-ray transforms on surfaces and tensor tomography
Abstract:  A family of integral geometric problems on surfaces consists of assessing what is reconstructible of a given integrand from knowledge of its integrals along all non-trapped geodesics, and how to reconstruct it. Such a class of problems includes the problem of reconstructing a function defined on the surface (with application to, e.g., computerized tomography in media with variable index of refraction), or a symmetric tensor field (in connection to other imaging methods, or arising as the linearized boundary rigidity problem on a manifold). In this talk we will discuss some inversion approaches to reconstruct functions on “simple” Riemannian surfaces, from knowledge of certain weighted geodesic ray transforms. These formulas will then be connected with recent explicit inversions of the tensor tomography problem derived by the author in the case of the Euclidean disk. In this setting, we will discuss in passing a full range characterization of the ray transform over tensors of any order, and in the case of functions, a new connection between a range characterization by L. Pestov and G. Uhlmann and the so-called Helgason-Ludwig consistency conditions for the Radon transform. Time allowing, we will go over recent results on inversion formulas for attenuated geodesic X-ray transforms on surfaces, and for unattenuated transforms on surfaces with hyperbolic trapped set (the latter in joint work with Colin Guillarmou), presenting examples where functions can be reconstructed from their ray transform even when the geometry contains trapped geodesics. Numerical illustrations will be presented throughout the talk.
 
Time & Date: 12:00-1:00 pm November 3, 2015   (Note unusual time!)
     Speaker:  Cris Moore (Santa Fe Institute)
     Title: The hunt for a quantum algorithm for Graph Isomorphism
Abstract:  After Shor discovered his quantum algorithm for factoring, many of us were hopeful that a similar algorithm could work for Graph Isomorphism. Our hopes were dashed — but dashed beautifully, by the representation theory of the symmetric group (to which I will give a friendly introduction). As a result, we now know that no quantum algorithm remotely similar to Shor’s will solve this problem.
This is joint work with several people, but I’ll focus on joint work with Alex Russell and Leonard Schulman.
 
Time & Date: Noon, Friday, October 30, 2015   (Note unusual date and time!)
     Speaker:   Ken Duffy (MIT Research Laboratory of Electronics)
     Title: Guesswork, quantifying computational security
Abstract:  The talk explores the connection among computational security, probability and information theory. The security of many systems is predicated on the following logic: a user selects a string, for example a password, from a list; an inquisitor who knows the list can query each string in turn until gaining access by chancing upon the user’s chosen string; the resulting system is deemed to be computationally secure so long as the list of strings is large.
Implicit in that definition is the assumption that the inquisitor knows nothing about the likely nature of the selected string. If instead one assumes that the inquisitor knows the probabilities with which strings are selected, then the random variable of interest is dubbed Guesswork, the number of queries required to identify a stochastically selected string, and the quantification of computational security becomes substantially more involved.
In this talk we review the seminal work of J. Massey (1994) and E. Arikan (1996) on the moments of Guesswork before describing some of our contributions, starting with a Large Deviation Principle (LDP) for Guesswork that enables direct estimates of the Guesswork distribution. Moreover, as the LDP is a covariant property, it facilitates a significant broadening of the remit of Guesswork to include the computational security of multi-user systems. These developments, as well as others related to information-theoretic security, will be discussed. No prior knowledge of the subject will be assumed.
This talk is based on work with M. Christiansen (NUIM), F. du Pin Calmon (MIT) and M. Medard (MIT).
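The Guesswork random variable itself can be sketched in a few lines: the optimal inquisitor queries strings in decreasing order of probability, so the string of rank i costs i queries. The two distributions below are purely illustrative:

```python
import numpy as np

def guesswork_moment(p, rho=1.0):
    """E[G^rho], where G is the number of queries needed when the
    inquisitor guesses strings in decreasing order of probability."""
    q = np.sort(np.asarray(p))[::-1]          # most likely string first
    return np.sum(np.arange(1, len(q) + 1) ** float(rho) * q)

n = 2 ** 10
uniform = np.full(n, 1.0 / n)                 # nothing known about the user
biased = np.geomspace(1.0, 1e-4, n)           # strongly skewed choices
biased /= biased.sum()
eg_uniform, eg_biased = guesswork_moment(uniform), guesswork_moment(biased)
```

For the uniform list E[G] = (n + 1)/2, while any skew in the user's selection sharply reduces the expected number of queries, which is the quantitative point of the talk.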
 
Date: October 27, 2015   – POSTPONED!
     Speaker:   Richard Jordan (MIT Lincoln Laboratory)
     Title: Clusters and Communities in Air Traffic Delay Networks
Abstract:  The air transportation system is a network of many interacting, capacity-constrained elements. When the demand for airport and airspace resources exceeds the available capacities of these resources, delays occur. The state of the air transportation system at any time can be represented as a weighted directed graph in which the nodes correspond to airports, and the weight on each arc is the delay experienced by departures on that origin-destination pair. Over the course of any day, the state of the system progresses through a time series, where the state at any time-step is the weighted directed graph described above. This talk will describe approaches for clustering air traffic delay network data from the US National Airspace System, in order to identify characteristic delay states, as well as characteristic types of days. The similarity of delay states during clustering is evaluated on the basis of not only the in- and out-degrees of the nodes, but also network-theoretic properties such as the eigenvector centralities and the hub and authority scores of different nodes. Finally, we’ll look at community detection, that is, the grouping of nodes (airports) based on their similarities within a system delay state. The type of day is found to have an impact on the observed community structures.
Joint work with Hamsa Balakrishnan, Karthik Gopalakrishnan and Jacob Avery, Department of Aeronautics & Astronautics, MIT
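The hub and authority scores used as similarity features above come from the classic HITS power iteration on a weighted directed graph. A generic sketch with a made-up three-airport delay matrix (not data from the study):

```python
import numpy as np

def hits(W, iters=100):
    """Hub and authority scores via power iteration: a strong hub sends
    delay to strong authorities (h = W a), and vice versa (a = W^T h)."""
    h = np.ones(len(W))
    for _ in range(iters):
        a = W.T @ h
        a /= np.linalg.norm(a)
        h = W @ a
        h /= np.linalg.norm(h)
    return h, a

# made-up 3-airport snapshot: W[i, j] = departure delay on route i -> j
W = np.array([[0.0, 30.0, 10.0],
              [5.0,  0.0, 40.0],
              [0.0, 20.0,  0.0]])
h, a = hits(W)
```

High hub scores flag airports that export delay; high authority scores flag airports that absorb it.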
 
Date: October 13, 2015
     Speaker:  Sayan Mukherjee (Statistical Science, Duke University)
     Title: Stochastic topology: random walks and percolation
Abstract:  The graph Laplacian and random walks on graphs have impacted statistics, computer science, and mathematics. I will motivate why it is of interest to extend these graph-based algorithms to simplicial complexes, which capture higher-order relations. I will describe recent efforts to define random walks on simplicial complexes with stationary distributions related to the combinatorial (Hodge) Laplacian. This work will touch on higher-order Cheeger inequalities, an extension of label propagation to edges or higher-order complexes, and a generalization of results on near-linear-time solutions for linear systems. Given n points drawn from a point process on a manifold, consider the random set which consists of the union of balls of radius r around the points. As n goes to infinity, r is sent to zero at varying rates. For this stochastic process, I will provide scaling limits and phase transitions on the counts of Betti numbers and critical points.
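The combinatorial Hodge Laplacian mentioned above is easy to build from boundary matrices. A minimal sketch (a single triangle, hollow versus filled, purely for illustration):

```python
import numpy as np

def betti_1(B1, B2):
    """First Betti number = dim ker of the Hodge 1-Laplacian
    L1 = B1^T B1 + B2 B2^T (number of independent 1-D holes)."""
    L1 = B1.T @ B1 + B2 @ B2.T
    return int(np.sum(np.abs(np.linalg.eigvalsh(L1)) < 1e-9))

# vertex-edge incidence for the triangle {0, 1, 2} with oriented
# edges e01, e02, e12 (one column per edge)
B1 = np.array([[-1.0, -1.0,  0.0],
               [ 1.0,  0.0, -1.0],
               [ 0.0,  1.0,  1.0]])

hollow = np.zeros((3, 0))                  # no 2-cells: the loop is a hole
filled = np.array([[1.0], [-1.0], [1.0]])  # boundary of the 2-cell [0,1,2]
```

The hollow triangle has one harmonic 1-form (one hole); filling it in kills the kernel, exactly the mechanism behind the Betti-number phase transitions in the abstract.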
 

2012-2014 AIM Seminar Talks Archive