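[Overview] The Conference on Advances in Neural Information Processing Systems (NIPS) is a top academic conference in machine learning, and it ranks among the very best even across computer science as a whole. This year's NIPS will be held in December in Long Beach, USA. The conference received 3,240 paper submissions and accepted 678, an acceptance rate of 20.9%; the accepted papers include 40 oral papers and 112 spotlight papers. Microsoft had 16 papers accepted, 4 of them from Microsoft Research Asia, and Google had 23. Tsinghua University had 6 papers accepted this year, with Academician Zhang Bo, Dr. Wang Jianmin, Dr. Lu Jiwen, and Dr. Zhu Jun among the authors, while Peking University had four. Several other institutions, including the Chinese Academy of Sciences, the University of Science and Technology of China, the Hong Kong University of Science and Technology, the Chinese University of Hong Kong, and City University of Hong Kong, also had multiple papers accepted.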
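▌Introduction
The Conference on Advances in Neural Information Processing Systems (NIPS) is a top academic conference in machine learning, and it ranks among the very best even across computer science as a whole. To encourage interdisciplinary research, NIPS by convention accepts some neuroscience papers alongside its machine learning papers, sometimes as many as one third of the total. Compared with other top machine learning conferences such as the International Conference on Machine Learning (ICML), NIPS leans more toward neural networks and Bayesian methods. Because its neuroscience papers generally do not reach the level of papers in the leading journals of that field, while its machine learning papers are top tier, NIPS is usually regarded as a top machine learning conference. Each NIPS submission receives about 6 reviews: 3 detailed ("heavy") reviews and 3 brief ("light") reviews. A heavy review is the kind of review one usually sees, with fairly detailed comments on the paper's strengths and weaknesses, whereas a light review only requires the reviewer to give a short summary of the paper. The conference holds that although light reviews are not specific, their scores can still inform the final acceptance decision. Since 2013, the reviews and author responses for accepted papers have been published online together with the papers.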
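NIPS 2017 will be held in December in Long Beach, USA, but discussion started long before then, especially around the papers. The conference received 3,240 paper submissions and accepted 678, an acceptance rate of 20.9%; the accepted papers include 40 oral papers and 112 spotlight papers. The detailed list of accepted papers was released recently; see: https://nips.cc/Conferences/2017/AcceptedPapersInitial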
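As a quick check of the figures quoted above, the following minimal sketch recomputes the acceptance rate and the share of oral and spotlight papers. The variable names are illustrative, and the `other` count simply means accepted papers that are neither oral nor spotlight; the article does not say how those papers are presented.

```python
# Figures quoted in the article.
submitted = 3240
accepted = 678
orals = 40
spotlights = 112

# Acceptance rate and the share of accepted papers in each category.
acceptance_rate = accepted / submitted
oral_share = orals / accepted
spotlight_share = spotlights / accepted

# Accepted papers that are neither oral nor spotlight (illustrative label).
other = accepted - orals - spotlights

print(f"acceptance rate: {acceptance_rate:.1%}")
print(f"oral share of accepted papers: {oral_share:.1%}")
print(f"spotlight share of accepted papers: {spotlight_share:.1%}")
print(f"accepted papers that are neither: {other}")
```

The computed acceptance rate rounds to the 20.9% reported in the article.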
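Microsoft had 16 papers accepted, 4 of them from Microsoft Research Asia. Google had 23, including the much-discussed "Attention is All you Need". OpenAI, which is backed by Elon Musk, had three, and Facebook had 6.
Tsinghua University, widely regarded as the strongest research university in China, had 6 papers accepted this year, with Academician Zhang Bo, Dr. Wang Jianmin, Dr. Lu Jiwen, and Dr. Zhu Jun among the authors, while Peking University had four. In addition, several other institutions, including the Chinese Academy of Sciences, the University of Science and Technology of China, the Hong Kong University of Science and Technology, the Chinese University of Hong Kong, and City University of Hong Kong, also had multiple papers accepted at NIPS.
CMU professor Tuomas Sandholm and his PhD student Noam Brown won a NIPS-17 Best Paper Award for the paper "Safe and Nested Subgame Solving for Imperfect-Information Games".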
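▌Keyword statistics:
Zhuanzhi's keyword statistics are as follows:
As can be seen, learning theory, deep learning, neural networks, variational methods, and Gaussian-related methods are among the hot topics of the submitted papers.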
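The article does not describe how Zhuanzhi computed these statistics. As a rough illustration only, the sketch below assumes a simple approach: count how many titles contain each word, after dropping a small hand-picked stopword list. The function `keyword_counts`, the `STOPWORDS` set, and the five sample titles (taken from the paper list) are all chosen here for illustration.

```python
import re
from collections import Counter

# A few titles from the paper list, used as sample input.
titles = [
    "Deep Subspace Clustering Network",
    "Dilated Recurrent Neural Networks",
    "Self-Normalizing Neural Networks",
    "Convolutional Gaussian Processes",
    "Bayesian GANs",
]

# Hand-picked stopwords (an assumption; a real analysis would use a fuller list).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "via", "with"}


def keyword_counts(titles):
    """Count, for each word, the number of titles that contain it."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z]+", title.lower())
        # Use a set so a word is counted at most once per title.
        counts.update({w for w in words if w not in STOPWORDS})
    return counts


counts = keyword_counts(titles)
for word, n in counts.most_common(5):
    print(word, n)
```

Counting each word once per title keeps a single long title from dominating the ranking; a fuller analysis would probably also merge singular and plural forms and count multi-word phrases such as "neural networks".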
▌Paper list:
Source:
https://papers.nips.cc/book/advances-in-neural-information-processing-systems-30-2017
- Wider and Deeper, Cheaper and Faster: Tensorized LSTMs for Sequence Learning
- Concentration of Multilinear Functions of the Ising Model with Applications to Network Data
- Deep Subspace Clustering Network
- Attentional Pooling for Action Recognition
- On the Consistency of Quick Shift
- Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization
- Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis
- Dilated Recurrent Neural Networks
- Hunt For The Unique, Stable, Sparse And Fast Feature Learning On Graphs
- Scalable Generalized Linear Bandits: Online Computation and Hashing
- Probabilistic Models for Integration Error in the Assessment of Functional Cardiac Models
- Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent
- Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning
- Interactive Submodular Bandit
- Scene Physics Acquisition via Visual De-animation
- Label Efficient Learning of Transferable Representations across Domains and Tasks
- Decoding with Value Networks for Neural Machine Translation
- Parametric Simplex Method for Sparse Learning
- Group Sparse Additive Machine
- Uprooting and Rerooting Higher-order Graphical Models
- The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings
- From Parity to Preference: Learning with Cost-effective Notions of Fairness
- Inferring Generative Model Structure with Static Analysis
- Structured Embedding Models for Grouped Data
- A Linear-Time Kernel Goodness-of-Fit Test
- Cortical microcircuits as gated-recurrent neural networks
- k-Support and Ordered Weighted Sparsity for Overlapping Groups: Hardness and Algorithms
- A simple model of recognition and recall memory
- On Structured Prediction Theory with Calibrated Convex Surrogate Losses
- Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
- MaskRNN: Instance Level Video Object Segmentation
- Gated Recurrent Convolution Neural Network for OCR
- Towards Accurate Binary Convolutional Neural Network
- Semi-Supervised Learning for Optical Flow with Generative Adversarial Networks
- Learning a Multi-View Stereo Machine
- Phase Transitions in the Pooled Data Problem
- Universal Style Transfer via Feature Transforms
- On the Model Shrinkage Effect of Gamma Process Edge Partition Models
- Pose Guided Person Image Generation
- Inference in Graphical Models via Semidefinite Programming Hierarchies
- Variable Importance Using Decision Trees
- Preventing Gradient Explosions in Gated Recurrent Units
- On the Power of Truncated SVD for General High-rank Matrix Estimation Problems
- f-GANs in an Information Geometric Nutshell
- Multimodal Image-to-Image Translation by Enforcing Bi-Cycle Consistency
- Mixture-Rank Matrix Approximation for Collaborative Filtering
- Non-monotone Continuous DR-submodular Maximization: Structure and Algorithms
- Learning with Average Top-k Loss
- Learning multiple visual domains with residual adapters
- Dykstra's Algorithm, ADMM, and Coordinate Descent: Connections, Insights, and Extensions
- Flat2Sphere: Learning Spherical Convolution for Fast Features from 360° Imagery
- 3D Shape Reconstruction by Modeling 2.5D Sketch
- Multimodal Learning and Reasoning for Visual Question Answering
- Adversarial Surrogate Losses for Ordinal Regression
- Hypothesis Transfer Learning via Transformation Functions
- Adversarial Invariant Feature Learning
- Convergence Analysis of Two-layer Neural Networks with ReLU Activation
- Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization
- Langevin Dynamics with Continuous Tempering for Training Deep Neural Networks
- Efficient Online Linear Optimization with Approximation Algorithms
- Geometric Descent Method for Convex Composite Minimization
- Diffusion Approximations for Online Principal Component Estimation and Global Convergence
- Avoiding Discrimination through Causal Reasoning
- Nonparametric Online Regression while Learning the Metric
- Recycling for Fairness: Learning with Conditional Distribution Matching Constraints
- Safe and Nested Subgame Solving for Imperfect-Information Games
- Unsupervised Image-to-Image Translation Networks
- Coded Distributed Computing for Inverse Problems
- A Screening Rule for l1-Regularized Ising Model Estimation
- Improved Dynamic Regret for Non-degeneracy Functions
- Learning Efficient Object Detection Models with Knowledge Distillation
- One-Sided Unsupervised Domain Mapping
- Deep Mean-Shift Priors for Image Restoration
- Greedy Algorithms for Cone Constrained Optimization with Convergence Guarantees
- A New Theory for Nonconvex Matrix Completion
- Robust Hypothesis Test for Functional Effect with Gaussian Processes
- Lower bounds on the robustness to adversarial perturbations
- Minimizing a Submodular Function from Samples
- Introspective Classification with Convolutional Nets
- Label Distribution Learning Forests
- Unsupervised object learning from dense equivariant image labelling
- Compression-aware Training of Deep Neural Networks
- Multiscale Semi-Markov Dynamics for Intracortical Brain-Computer Interfaces
- PredRNN: Recurrent Neural Networks for Video Prediction using Spatiotemporal LSTMs
- Detrended Partial Cross Correlation for Brain Connectivity Analysis
- Contrastive Learning for Image Captioning
- Safe Model-based Reinforcement Learning with Stability Guarantees
- Online multiclass boosting
- Matching on Balanced Nonlinear Representations for Treatment Effects Estimation
- Learning Overcomplete HMMs
- GP CaKe: Effective brain connectivity with causal kernels
- Decoupling "when to update" from "how to update"
- Self-Normalizing Neural Networks
- Learning to Pivot with Adversarial Networks
- MolecuLeNet: A continuous-filter convolutional neural network for modeling quantum interactions
- Active Bias: Training a More Accurate Neural Network by Emphasizing High Variance Samples
- Differentiable Learning of Submodular Functions
- Inductive Representation Learning on Large Graphs
- Subset Selection for Sequential Data
- Question Asking as Program Generation
- Revisiting Perceptron: Efficient and Label-Optimal Learning of Halfspaces
- Gradient Descent Can Take Exponential Time to Escape Saddle Points
- Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction
- One-Shot Imitation Learning
- Learning the Morphology of Brain Signals Using Alpha-Stable Convolutional Sparse Coding
- Integration Methods and Optimization Algorithms
- Sharpness, Restart and Acceleration
- Learning Koopman Invariant Subspaces for Dynamic Mode Decomposition
- Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations
- Learning spatiotemporal piecewise-geodesic trajectories from longitudinal manifold-valued data
- Improving Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms and Its Applications
- Predictive-State Decoders: Encoding the Future into Recurrent Networks
- Posterior sampling for reinforcement learning: worst-case regret bounds
- Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results
- Matching neural paths: transfer from recognition to correspondence search
- Linearly constrained Gaussian processes
- Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data
- Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets
- Learning to Inpaint for Image Compression
- Adaptive Bayesian Sampling with Monte Carlo EM
- No More Fixed Penalty Parameter in ADMM: Faster Convergence with New Adaptive Penalization
- Shape and Material from Sound
- Flexible statistical inference for mechanistic models of neural dynamics
- Online Prediction with Selfish Experts
- Tensor Biclustering
- DPSCREEN: Dynamic Personalized Screening
- Learning Unknown Markov Decision Processes: A Thompson Sampling Approach
- Testing and Learning on Distributions with Symmetric Noise Invariance
- A Dirichlet Mixture Model of Hawkes Processes for Event Sequence Clustering
- Deanonymization in the Bitcoin P2P Network
- Accelerated consensus via Min-Sum Splitting
- Generalized Linear Model Regression under Distance-to-set Penalties
- Adaptive sampling for a population of neurons
- Nonbacktracking Bounds on the Influence in Independent Cascade Models
- Learning with Feature Evolvable Streams
- Online Convex Optimization with Stochastic Constraints
- Max-Margin Invariant Features from Transformed Unlabelled Data
- Cognitive Impairment Prediction in Alzheimer’s Disease with Regularized Modal Regression
- Translation Synchronization via Truncated Least Squares
- From which world is your graph
- A New Alternating Direction Method for Linear Programming
- Regret Analysis for Continuous Dueling Bandit
- Best Response Regression
- TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning
- Learning Affinity via Spatial Propagation Networks
- Linear regression without correspondence
- NeuralFDR: Learning Discovery Thresholds from Hypothesis Features
- Cost efficient gradient boosting
- Probabilistic Rule Realization and Selection
- Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions
- A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis
- Learning Multiple Tasks with Deep Relationship Networks
- Deep Hyperalignment
- Online to Offline Conversions and Adaptive Minibatch Sizes
- Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure
- Deep Learning with Topological Signatures
- Predicting User Activity Level In Point Process Models With Mass Transport Equation
- Submultiplicative Glivenko-Cantelli and Uniform Convergence of Revenues
- Deep Dynamic Poisson Factorization Model
- Positive-Unlabeled Learning with Non-Negative Risk Estimator
- Optimal Sample Complexity of M-wise Data for Top-K Ranking
- What-If Reasoning using Counterfactual Gaussian Processes
- Communication-Efficient Stochastic Gradient Descent, with Applications to Neural Networks
- On the Convergence of Block Coordinate Descent in Training DNNs with Tikhonov Regularization
- Train longer, generalize better: closing the generalization gap in large batch training of neural networks
- Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
- Model evidence from nonequilibrium simulations
- Minimal Exploration in Structured Stochastic Bandits
- Learned D-AMP: Principled Neural-network-based Compressive Image Recovery
- Deliberation Networks: Sequence Generation Beyond One-Pass Decoding
- Adaptive Clustering through Semidefinite Programming
- Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning
- Repeated Inverse Reinforcement Learning
- The Numerics of GANs
- Practical Bayesian Optimization for Model Fitting with Bayesian Adaptive Direct Search
- Learning Chordal Markov Networks via Branch and Bound
- Revenue Optimization with Approximate Bid Predictions
- Solving (Almost) all Systems of Random Quadratic Equations
- Unsupervised Learning of Disentangled Latent Representations from Sequential Data
- Lookahead Bayesian Optimization with Inequality Constraints
- Hierarchical Methods of Moments
- Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts
- Revisit Fuzzy Neural Network: Demystifying Batch Normalization and ReLU with Generalized Hamming Network
- Speeding Up Latent Variable Gaussian Graphical Model Estimation via Nonconvex Optimization
- Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models
- Generating steganographic images via adversarial training
- Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
- PixelGAN Autoencoders
- Consistent Multitask Learning with Nonlinear Output Relations
- Fast Alternating Minimization Algorithms for Dictionary Learning
- Learning ReLUs via Gradient Descent
- Stabilizing Training of Generative Adversarial Networks through Regularization
- Expectation Propagation with Stochastic Kinetic Model in Complex Interaction Systems
- Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs
- Compatible Reward Inverse Reinforcement Learning
- First-Order Adaptive Sample Size Methods to Reduce Complexity of Empirical Risk Minimization
- Hiding Images in Plain Sight: Deep Steganography
- Neural Program Meta-Induction
- Bayesian Dyadic Trees and Histograms for Regression
- A graph-theoretic approach to multitasking
- Consistent Robust Regression
- Natural value approximators: learning when to trust past estimates
- Bandits Dueling on Partially Ordered Sets
- Elementary Symmetric Polynomials for Optimal Experimental Design
- Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols
- Backprop without Learning Rates Through Coin Betting
- Pixels to Graphs by Associative Embedding
- Runtime Neural Pruning
- Compressing the Gram Matrix for Learning Neural Networks in Polynomial Time
- MMD GAN: Towards Deeper Understanding of Moment Matching Network
- The Reversible Residual Network: Backpropagation Without Storing Activations
- Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe
- Zap Q-Learning
- Expectation Propagation for t-Exponential Family Using Q-Algebra
- Few-Shot Learning Through an Information Retrieval Lens
- Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation
- Associative Embedding: End-to-End Learning for Joint Detection and Grouping
- Practical Locally Private Heavy Hitters
- Large-Scale Quadratically Constrained Quadratic Program via Low-Discrepancy Sequences
- Inhomogeneous Hypergraph Clustering with Applications
- Differentiable Learning of Logical Rules for Knowledge Base Reasoning
- Deep Multi-task Gaussian Processes for Survival Analysis with Competing Risks
- Masked Autoregressive Flow for Density Estimation
- Non-convex Finite-Sum Optimization Via SCSG Methods
- Beyond normality: Learning sparse probabilistic graphical models in the non-Gaussian setting
- Inner-loop free ADMM using Auxiliary Deep Neural Networks
- OnACID: Online Analysis of Calcium Imaging Data in Real Time
- Collaborative PAC Learning
- Fast Black-box Variational Inference through Stochastic Trust-Region Optimization
- Scalable Demand-Aware Recommendation
- SGD Learns the Conjugate Kernel Class of the Network
- Noise-Tolerant Interactive Learning Using Pairwise Comparisons
- Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems
- Generative Local Metric Learning for Kernel Regression
- Information Theoretic Properties of Markov Random Fields, and their Algorithmic Applications
- Fitting Low-Rank Tensors in Constant Time
- Deep supervised discrete hashing
- Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation
- How regularization affects the critical points in linear networks
- Fisher GAN
- Information-theoretic analysis of generalization capability of learning algorithms
- Sparse Approximate Conic Hulls
- Rigorous Dynamics and Consistent Estimation in Arbitrarily Conditioned Linear Systems
- Toward Goal-Driven Neural Network Models for the Rodent Whisker-Trigeminal System
- Accuracy First: Selecting a Differential Privacy Level for Accuracy Constrained ERM
- EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
- Multitask Spectral Learning of Weighted Automata
- Multi-way Interacting Regression via Factorization Machines
- Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network
- Practical Data-Dependent Metric Compression with Provable Guarantees
- REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
- Nonlinear random matrix theory for deep learning
- Parallel Streaming Wasserstein Barycenters
- ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
- Dual Discriminator Generative Adversarial Nets
- Dynamic Revenue Sharing
- Decomposition-Invariant Conditional Gradient for General Polytopes with Line Search
- Multi-agent Predictive Modeling with Attentional CommNets
- An Empirical Bayes Approach to Optimizing Machine Learning Algorithms
- Differentially Private Empirical Risk Minimization Revisited: Faster and More General
- Variational Inference via χ Upper Bound Minimization
- On Quadratic Convergence of DC Proximal Newton Algorithm in Nonconvex Sparse Learning
- #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
- An Empirical Study on The Properties of Random Bases for Kernel Methods
- Bridging the Gap Between Value and Policy Based Reinforcement Learning
- Premise Selection for Theorem Proving by Deep Graph Embedding
- A Bayesian Data Augmentation Approach for Learning Deep Models
- Principles of Riemannian Geometry in Neural Networks
- Cold-Start Reinforcement Learning with Softmax Policy Gradients
- Online Dynamic Programming
- Alternating Estimation for Structured High-Dimensional Multi-Response Models
- Convolutional Gaussian Processes
- Estimation of the covariance structure of heavy-tailed distributions
- Mean Field Residual Networks: On the Edge of Chaos
- Decomposable Submodular Function Minimization: Discrete and Continuous
- Gauging Variational Inference
- Deep Recurrent Neural Network-Based Identification of Precursor microRNAs
- Robust Estimation of Neural Signals in Calcium Imaging
- State Aware Imitation Learning
- Beyond Parity: Fairness Objectives for Collaborative Filtering
- A PAC-Bayesian Analysis of Randomized Learning with Application to Stochastic Gradient Descent
- Fully Decentralized Policies for Multi-Agent Systems: An Information Theoretic Approach
- Model-Powered Conditional Independence Test
- Deep Voice 2: Multi-Speaker Neural Text-to-Speech
- Variance-based Regularization with Convex Objectives
- Deep Lattice Networks and Partial Monotonic Functions
- Continual Learning with Deep Generative Replay
- AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms
- Learning Causal Structures Using Regression Invariance
- Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback
- Minimax Optimal Players for the Finite-Time 3-Expert Prediction Problem
- Reinforcement Learning under Model Mismatch
- Hierarchical Attentive Recurrent Tracking
- Tomography of the London Underground: a Scalable Model for Origin-Destination Data
- Rotting Bandits
- Unbiased estimates for linear regression via volume sampling
- An Applied Algorithmic Foundation for Hierarchical Clustering
- Adaptive Accelerated Gradient Converging Method under Hölderian Error Bound Condition
- Stein Variational Gradient Descent as Gradient Flow
- Partial Hard Thresholding: Towards A Unified Analysis of Support Recovery
- Shallow Updates for Deep Reinforcement Learning
- A Highly Efficient Gradient Boosting Decision Tree
- Adversarial Ranking for Language Generation
- Regret Minimization in MDPs with Options without Prior Knowledge
- Net-Trim: Convex Pruning of Deep Neural Networks with Performance Guarantee
- Graph Matching via Multiplicative Update Algorithm
- Dynamic Importance Sampling for Anytime Bounds of the Partition Function
- Is the Bellman residual a bad proxy?
- Generalization Properties of Learning with Random Features
- Differentially private Bayesian learning on distributed data
- Learning to Compose Domain-Specific Transformations for Data Augmentation
- Wasserstein Learning of Deep Generative Point Process Models
- Ensemble Sampling
- Language modeling with recurrent highway hypernetworks
- Searching in the Dark: Practical SVRG Methods under Error Bound Conditions with Guarantee
- Bayesian Compression for Deep Learning
- Streaming Sparse Gaussian Process Approximations
- VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning
- Sparse k-Means Embedding
- Utile Context Tree Weighting
- A Regularized Framework for Sparse and Structured Neural Attention
- Multi-output Polynomial Networks and Factorization Machines
- Clustering Billions of Reads for DNA Data Storage
- Multi-Objective Non-parametric Sequential Prediction
- A Universal Analysis of Large-Scale Regularized Least Squares Solutions
- Deep Sets
- ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events
- Process-constrained batch Bayesian optimisation
- Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes
- Spherical convolutions and their application in molecular modelling
- Efficient Optimization for Linear Dynamical Systems with Applications to Clustering and Sparse Coding
- On Optimal Generalizability in Parametric Learning
- Near Optimal Sketching of Low-Rank Tensor Regression
- Tractability in Structured Probability Spaces
- Model-based Bayesian inference of neural activity and connectivity from all-optical interrogation of a neural circuit
- Gaussian process based nonlinear latent structure discovery in multivariate spike train data
- Neural system identification for large populations separating "what" and "where"
- Certified Defenses for Data Poisoning Attacks
- Eigen-Distortions of Hierarchical Representations
- Limitations on Variance-Reduction and Acceleration Schemes for Finite Sums Optimization
- Unsupervised Sequence Classification using Sequential Output Statistics
- Subset Selection under Noise
- Collecting Telemetry Data Privately
- Concrete Dropout
- Adaptive Batch Size for Safe Policy Gradients
- A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning
- PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference
- Bayesian GANs
- Off-policy evaluation for slate recommendation
- A multi-agent reinforcement learning model of common-pool resource appropriation
- On the Optimization Landscape of Tensor Decompositions
- High-Order Attention Models for Visual Question Answering
- Sparse convolutional coding for neuronal assembly detection
- Quantifying how much sensory information in a neural code is relevant for behavior
- Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks
- Reducing Reparameterization Gradient Variance
- Visual Reference Resolution using Attention Memory for Visual Dialog
- Joint distribution optimal transportation for domain adaptation
- Multiresolution Kernel Approximation for Gaussian Process Regression
- Collapsed variational Bayes for Markov jump processes
- Universal consistency and minimax rates for online Mondrian Forest
- Efficiency Guarantees from Data
- Diving into the shallows: a computational perspective on large-scale shallow learning
- End-to-end Differentiable Proving
- Influence Maximization with ε-Almost Submodular Threshold Function
- Inferring The Latent Structure of Human Decision-Making from Raw Visual Inputs
- Variational Laws of Visual Attention for Dynamic Scenes
- Recursive Sampling for the Nystrom Method
- Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
- Dynamic Routing Between Capsules
- Incorporating Side Information by Adaptive Convolution
- Conic Scan Coverage algorithm for nonparametric topic modeling
- FALKON: An Optimal Large Scale Kernel Method
- Structured Generative Adversarial Networks
- Conservative Contextual Linear Bandits
- Variational Memory Addressing in Generative Models
- On Tensor Train Rank Minimization: Statistical Efficiency and Scalable Algorithm
- Scalable Levy Process Priors for Spectral Kernel Learning
- Deep Hyperspherical Learning
- Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction
- On-the-fly Operation Batching in Dynamic Computation Graphs
- Nonlinear Acceleration of Stochastic Algorithms
- Optimized Pre-Processing for Discrimination Prevention
- YASS: Yet Another Spike Sorter
- Independence clustering (without a matrix)
- Fast amortized inference of neural activity from calcium imaging data with variational autoencoders
- Adaptive Active Hypothesis Testing under Limited Information
- Streaming Weak Submodularity: Interpreting Neural Networks on the Fly
- Successor Features for Transfer in Reinforcement Learning
- Counterfactual Fairness
- Prototypical Networks for Few-shot Learning
- Triple Generative Adversarial Nets
- Efficient Sublinear-Regret Algorithms for Online Sparse Linear Regression
- Mapping distinct timescales of functional interactions among brain networks
- Multi-Armed Bandits with Metric Movement Costs
- Learning A Structured Optimal Bipartite Graph for Co-Clustering
- Learning Low-Dimensional Metrics
- The Marginal Value of Adaptive Gradient Methods in Machine Learning
- Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification
- Deconvolutional Paragraph Representation Learning
- Random Permutation Online Isotonic Regression
- A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
- Inverse Filtering for Hidden Markov Models
- Non-parametric Neural Networks
- Learning Active Learning from Data
- VAE Learning via Stein Variational Gradient Descent
- Deep adversarial neural decoding
- Efficient Use of Limited-Memory Resources to Accelerate Linear Learning
- Temporal Coherency based Criteria for Predicting Video Frames using Deep Multi-stage Generative Adversarial Networks
- Sobolev Training for Neural Networks
- Multi-Information Source Optimization
- Deep Reinforcement Learning from Human Preferences
- On the Fine-Grained Complexity of Empirical Risk Minimization: Kernel Methods and Neural Networks
- Policy Gradient With Value Function Approximation For Collective Multiagent Planning
- Adversarial Symmetric Variational Autoencoder
- Tensor encoding and decomposition of brain connectomes with application to tractography evaluation
- A Minimax Optimal Algorithm for Crowdsourcing
- Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach
- A Decomposition of Forecast Error in Prediction Markets
- Safe Adaptive Importance Sampling
- Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net
- Polynomial Codes: an Optimal Design for High-Dimensional Coded Matrix Multiplication
- Unsupervised Learning of Disentangled Representations from Video
- Federated Multi-Task Learning
- Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?
- The Expxorcist: Nonparametric Graphical Models Via Conditional Exponential Densities
- Improved Graph Laplacian via Geometric Self-Consistency
- Dual Path Networks
- Faster and Non-ergodic O(1/K) Stochastic Alternating Direction Method of Multipliers
- A Probabilistic Framework for Nonlinearities in Stochastic Neural Networks
- DisTraL: Robust multitask reinforcement learning
- Online Learning of Optimal Bidding Strategy in Repeated Multi-Commodity Auctions
- Trimmed Density Ratio Estimation
- Training recurrent networks to generate hypotheses about how the brain solves hard navigation problems
- Visual Interaction Networks
- Reconstruct & Crush Network
- Streaming Robust Submodular Maximization: A Partitioned Thresholding Approach
- Simple strategies for recovering inner products from coarsely quantized random projections
- Discovering Potential Influence via Information Bottleneck
- Doubly Stochastic Variational Inference for Deep Gaussian Processes
- Ranking Data with Continuous Labels through Oriented Recursive Partitions
- Scalable Model Selection for Belief Networks
- Targeting EEG/LFP Synchrony with Neural Nets
- Near-Optimal Edge Evaluation in Explicit Generalized Binomial Graphs
- Non-Stationary Spectral Kernels
- Overcoming Catastrophic Forgetting by Incremental Moment Matching
- Balancing information exposure in social networks
- SafetyNets: Verifiable Execution of Deep Neural Networks on an Untrusted Cloud
- Query Complexity of Clustering with Side Information
- QMDP-Net: Deep Learning for Planning under Partial Observability
- Robust Optimization for Non-Convex Objectives
- Thy Friend is My Friend: Iterative Collaborative Filtering for Sparse Matrix Estimation
- Adaptive Classification for Prediction Under a Budget
- Convergence rates of a partition based Bayesian multivariate density estimation method
- Affine-Invariant Online Optimization
- Beyond Worst-case: A Probabilistic Analysis of Affine Policies in Dynamic Optimization
- A unified approach to interpreting model predictions
- Stochastic Approximation for Canonical Correlation Analysis
- Investigating the learning dynamics of deep neural networks using random matrix theory
- Sample and Computationally Efficient Learning Algorithms under S-Concave Distributions
- Scalable Variational Inference for Dynamical Systems
- Context Selection for Embedding Models
- Working hard to know your neighbor's margins: Local descriptor learning loss
- Accelerated Stochastic Greedy Coordinate Descent by Soft Thresholding Projection onto Simplex
- Multi-Task Learning for Contextual Bandits
- Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon
- Accelerated First-order Methods for Geodesically Convex Optimization on Riemannian Manifolds
- Selective Classification for Deep Neural Networks
- Minimax Estimation of Bandable Precision Matrices
- Monte-Carlo Tree Search by Best Arm Identification
- Group Additive Structure Identification for Kernel Nonparametric Regression
- Fast, Sample-Efficient Algorithms for Structured Phase Retrieval
- Hash Embeddings for Efficient Word Representations
- Online Learning for Multivariate Hawkes Processes
- Maximum Margin Interval Trees
- DropoutNet: Addressing Cold Start in Recommender Systems
- A simple neural network module for relational reasoning
- Q-LDA: Uncovering Latent Patterns in Text-based Sequential Decision Processes
- Online Reinforcement Learning in Stochastic Games
- Position-based Multiple-play Multi-armed Bandit Problem with Unknown Position Bias
- Active Exploration for Learning Symbolic Representations
- Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling
- Fair Clustering Through Fairlets
- Polynomial time algorithms for dual volume sampling
- Hindsight Experience Replay
- Stochastic and Adversarial Online Learning without Hyperparameters
- Teaching Machines to Describe Images with Natural Language Feedback
- Perturbative Black Box Variational Inference
- GibbsNet: Iterative Adversarial Inference for Deep Graphical Models
- PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
- Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization
- Learning Graph Embeddings with Embedding Propagation
- Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes
- A-NICE-MC: Adversarial Training for MCMC
- Excess Risk Bounds for the Bayes Risk using Variational Inference in Latent Gaussian Models
- Real-Time Bidding with Side Information
- Saliency-based Sequential Image Attention with Multiset Prediction
- Variational Inference for Gaussian Process Models with Linear Complexity
- K-Medoids For K-Means Seeding
- Identifying Outlier Arms in Multi-Armed Bandit
- Online Learning with Transductive Regret
- Riemannian approach to batch normalization
- Self-supervised Learning of Motion Capture
- Triangle Generative Adversarial Networks
- Preserving Proximity and Global Ranking for Node Embedding
- Bayesian Optimization with Gradients
- Second-order Optimization in Deep Reinforcement Learning using Kronecker-factored Approximation
- Renyi Differential Privacy Mechanisms for Posterior Sampling
- Online Learning with a Hint
- Identification of Gaussian Process State Space Models
- Robust Imitation of Diverse Behaviors
- Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent
- Local Aggregative Games
- A Sample Complexity Measure with Applications to Learning Optimal Auctions
- Thinking Fast and Slow with Deep Learning and Tree Search
- EEG-GRAPH: A Factor Graph Based Model for Capturing Spatial, Temporal, and Observational Relationships in Electroencephalograms
- Improving the Expected Improvement Algorithm
- Hybrid Reward Architecture for Reinforcement Learning
- Approximate Supermodularity Bounds for Experimental Design
- Maximizing Subset Accuracy with Recurrent Neural Networks in Multi-label Classification
- AdaGAN: Boosting Generative Models
- Straggler Mitigation in Distributed Optimization Through Data Encoding
- Multi-View Decision Processes
- A Greedy Approach for Budgeted Maximum Inner Product Search
- SVD-Softmax: Fast Softmax Approximation on Large Vocabulary Neural Networks
- Plan, Attend, Generate: Planning for Sequence-to-Sequence Models
- Task-based End-to-end Model Learning in Stochastic Optimization
- Towards Understanding Adversarial Learning for Joint Distribution Matching
- Finite sample analysis of the GTD Policy Evaluation Algorithms in Markov Setting
- On the Complexity of Learning Neural Networks
- Hierarchical Implicit Models and Likelihood-Free Variational Inference
- Improved Semi-supervised Learning with GANs using Manifold Invariances
- Approximation and Convergence Properties of Generative Adversarial Learning
- From Bayesian Sparsity to Gated Recurrent Nets
- Min-Max Propagation
- What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?
- Gradient descent GAN optimization is locally stable
- Toward Robustness against Label Noise in Training Deep Discriminative Neural Networks
- Dualing GANs
- Deep Learning for Precipitation Nowcasting: A Benchmark and A New Model
- Do Deep Neural Networks Suffer from Crowding?
- Learning from Complementary Labels
- More powerful and flexible rules for online FDR control with memory and weights
- Learning from uncertain curves: The 2-Wasserstein metric for Gaussian processes
- Discriminative State Space Models
- On Fairness and Calibration
- Imagination-Augmented Agents for Deep Reinforcement Learning
- Extracting low-dimensional dynamics from multiple large-scale neural population recordings by learning to predict correlations
- Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
- Gradients of Generative Models for Improved Discriminative Analysis of Tandem Mass Spectra
- Asynchronous Parallel Coordinate Minimization for MAP Inference
- Multiscale Quantization for Fast Similarity Search
- Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space
- Improved Training of Wasserstein GANs
- Optimally Learning Populations of Parameters
- Clustering with Noisy Queries
- Higher-Order Total Variation Classes on Grids: Minimax Theory and Trend Filtering Methods
- Training Quantized Nets: A Deeper Understanding
- Permutation-based Causal Inference Algorithms with Interventions
- Time-dependent spatially varying graphical models, with application to brain fMRI data analysis
- Gradient Methods for Submodular Maximization
- Smooth Primal-Dual Coordinate Descent Algorithms for Nonsmooth Convex Optimization
- Maximizing the Spread of Influence from Training Data
- Multiplicative Weights Update with Constant Step-Size in Congestion Games: Convergence, Limit Cycles and Chaos
- Learning Neural Representations of Human Cognition across Many fMRI Studies
- A KL-LUCB algorithm for Large-Scale Crowdsourcing
- Collaborative Deep Learning in Fixed Topology Networks
- Fast-Slow Recurrent Neural Networks
- Learning Disentangled Representations with Semi-Supervised Deep Generative Models
- Learning to Generalize Intrinsic Images with a Structured Disentangling Autoencoder
- Exploring Generalization in Deep Learning
- A framework for Multi-A(rmed)/B(andit) Testing with Online FDR Control
- Fader Networks: Generating Image Variations by Sliding Attribute Values
- Action Centered Contextual Bandits
- Estimating Mutual Information for Discrete-Continuous Mixtures
- Attention is All you Need
- Recurrent Ladder Networks
- Parameter-Free Online Learning via Model Selection
- Bregman Divergence for Stochastic Variance Reduction: Saddle-Point and Adversarial Prediction
- Unbounded cache model for online language modeling with open vocabulary
- Predictive State Recurrent Neural Networks
- Early stopping for kernel boosting algorithms: A general analysis with localized complexities
- SVCCA: Singular Vector Canonical Correlation Analysis for Deep Understanding and Improvement
- Convolutional Phase Retrieval
- Estimating High-dimensional Non-Gaussian Multiple Index Models via Stein’s Lemma
- Gaussian Quadrature for Kernel Features
- Value Prediction Network
- On Learning Errors of Structured Prediction with Approximate Inference
- Efficient Second-Order Online Kernel Learning with Adaptive Embedding
- Implicit Regularization in Matrix Factorization
- Optimal Shrinkage of Singular Values Under Random Data Contamination
- Delayed Mirror Descent in Continuous Games
- Asynchronous Coordinate Descent under More Realistic Assumptions
- Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls
- Hierarchical Clustering Beyond the Worst-Case
- Invariance and Stability of Deep Convolutional Representations
- Statistical Cost Sharing
- The Expressive Power of Neural Networks: A View from the Width
- Spectrally-normalized margin bounds for neural networks
- Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
- Population Matching Discrepancy and Applications in Deep Learning
- Scalable Planning with Tensorflow for Hybrid Nonlinear Domains
- Boltzmann Exploration Done Right
- Towards the ImageNet-CNN of NLP: Pretraining Sentence Encoders with Machine Translation
- Neural Discrete Representation Learning
- Generalizing GANs: A Turing Perspective
- Scalable Log Determinants for Gaussian Process Kernel Learning
- Poincaré Embeddings for Learning Hierarchical Representations
- Learning Combinatorial Optimization Algorithms over Graphs
- Robust Conditional Probabilities
- Learning with Bandit Feedback in Potential Games
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
- Communication-Efficient Distributed Learning of Discrete Distributions
- Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
- When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness
- Matrix Norm Estimation from a Few Entries
- Deep Networks for Decoding Natural Images from Retinal Signals
- Causal Effect Inference with Deep Latent Variable Models
- Learning Identifiable Gaussian Bayesian Networks in Polynomial Time and Sample Complexity
- Gradient Episodic Memory for Continuum Learning
- Radon Machines: Effective Parallelisation for Machine Learning
- Semisupervised Clustering, AND-Queries and Locally Encodable Source Coding
- Clustering Stable Instances of Euclidean k-means.
- Good Semi-supervised Learning That Requires a Bad GAN
- On Blackbox Backpropagation and Jacobian Sensing
- Protein Interface Prediction using Graph Convolutional Networks
- Solid Harmonic Wavelet Scattering: Predicting Quantum Molecular Energy from Invariant Descriptors of 3D Electronic Densities
- Towards Generalization and Simplicity in Continuous Control
- Random Projection Filter Bank for Time Series Data
- Filtering Variational Objectives
- On Frank-Wolfe and Equilibrium Computation
- Modulating early visual processing by language
- Learning Mixture of Gaussians with Streaming Data
- Practical Hash Functions for Similarity Estimation and Dimensionality Reduction
- Two Time-Scale Update Rule for Generative Adversarial Nets
- The Scaling Limit of High-Dimensional Online Independent Component Analysis
- Approximation Algorithms for $\ell_0$-Low Rank Approximation
- The power of absolute discounting: all-dimensional distribution estimation
- Supervised Adversarial Domain Adaptation
- Spectral Mixture Kernels for Multi-Output Gaussian Processes
- Neural Expectation Maximization
- Online Learning of Linear Dynamical Systems
- Z-Forcing: Training Stochastic Recurrent Networks
- Thalamus Gated Recurrent Modules
- Neural Variational Inference and Learning in Undirected Graphical Models
- Subspace Clustering via Tangent Cones
- The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process
- Inverse Reward Design
- Structured Bayesian Pruning via Log-Normal Multiplicative Noise
- Attend and Predict: Understanding Gene Regulation by Selective Attention on Chromatin
- Acceleration and Averaging in Stochastic Descent Dynamics
- Kernel functions based on triplet comparisons
- An Error Detection and Correction Framework for Connectomics
- Style Transfer from Non-parallel Text by Cross-Alignment
- Cross-Spectral Factor Analysis
- Stochastic Submodular Maximization: The Case of Coverage Functions
- On Distributed Hierarchical Clustering
- Unsupervised Transformation Learning via Convex Relaxations
- A Sharp Error Analysis for the Fused Lasso, with Implications to Broader Settings and Approximate Screening
- Efficient Computation of Moments in Sum-Product Networks
- A Meta-Learning Perspective on Cold-Start Recommendations for Items
- Predicting Scene Parsing and Motion Dynamics in the Future
- Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference
- Efficient Approximation Algorithms for Strings Kernel Based Sequence Classification
- Kernel Feature Selection via Conditional Covariance Minimization
- Statistical Convergence Analysis of Gradient EM on General Gaussian Mixture Models
- Real Time Image Saliency for Black Box Classifiers
- Houdini: Democratizing Adversarial Examples
- Efficient and Flexible Inference for Stochastic Systems
- When Cyclic Coordinate Descent Beats Randomized Coordinate Descent
- Active Learning from Peers
- Learning Causal Graphs with Latent Variables
- Learning to Model the Tail
- Stochastic Mirror Descent for Non-Convex Optimization
- On Separability of Loss Functions, and Revisiting Discriminative Vs Generative Models
- Maxing and Ranking with Few Assumptions
- On clustering network-valued data
- A General Framework for Robust Interactive Learning
- Multi-view Matrix Factorization for Linear Dynamical System Estimation