Overview

The proliferation of open-source projects has produced large amounts of source code and related artifacts. Arguably, the open resources associated with software--including open-source repositories, Q&A sites, change histories, and communications between developers--are the richest and most detailed information resource for any technical area. Recent work has found that “natural”, human-produced software exhibits many interesting statistical regularities. As a consequence, code corpora, just like natural language corpora, are amenable to statistical modeling, and a number of software tasks such as coding, testing, porting, and bug patching can potentially be enhanced by the use of these statistical models.

This interdisciplinary workshop will explore issues related to the statistical modeling of software corpora, including topics such as: modeling repetitiveness in source code; using language models for code suggestion in IDEs; using probabilistic grammars to mine programming idioms; statistical methods for type inference in dynamically typed languages; statistical machine translation for porting applications between programming languages, or for “minifying” JavaScript; using statistical language models to find bugs; and statistical methods for automatic code patching, code summarization, code retrieval, code annotation, or test generation.

The workshop follows several earlier workshops on this topic, held at Microsoft Research, Dagstuhl, SIGSOFT FSE, and AAAI.

Call for papers

Download and print our call for papers!

This interdisciplinary workshop will explore issues related to the statistical modeling of software corpora, including topics such as:

  • Modeling repetitiveness ("naturalness") in source code
  • Applications to code suggestion in IDEs
  • Mining programming idioms
  • Statistical inference of types and other annotations
  • Applications of Statistical Machine Translation for porting and reverse engineering
  • Statistical methods for bug localization
  • Statistical methods for automatic code patching, code summarization, code retrieval, code annotation, or test generation
  • Formal and informal methods for enhancing assurance via NLP techniques

We invite short position papers or early-stage research papers of at most 4 pages in length. Several submissions will be invited for presentation.

If you are looking for software corpora to study, the CoNaLa corpus (curated by CMU) and the NL2Bash corpus (curated by the University of Washington) are available for use.

You are encouraged to use these datasets to demonstrate your technology.

Submission information

We invite contributions of up to four pages in length describing early-stage research or position papers. All submissions must be previously unpublished and not submitted elsewhere, and must conform to the ACM proceedings formatting guidelines as specified by ESEC/FSE.

Submission website via EasyChair

Important dates

August 31st, 2018: Paper submission deadline (AoE)
October 1st, 2018: Author notification
October 15th, 2018: Camera-ready deadline (AoE)

Organizing Committee
Yijun Yu, The Open University (UK)
Erik Fredericks, Oakland University
Prem Devanbu, University of California, Davis
 
Program Committee
Miltos Allamanis, Microsoft Cambridge (UK)
Earl Barr, University College London (UK)
Marc Brockschmidt, Microsoft Cambridge (UK)
Satish Chandra, Facebook, Inc. (USA)
William Cohen, Carnegie Mellon University (USA)
Premkumar Devanbu, University of California, Davis (USA)
Erik Fredericks, Oakland University (USA)
Reihaneh Hariri, Oakland University (USA)
Abram Hindle, University of Alberta (Canada)
Sung Kim, Hong Kong University of Science and Technology (HK)
Mark Marron, MSR, WA (USA)
Graham Neubig, Carnegie Mellon University (USA)
Michael Pradel, TU Darmstadt (Germany)
Baishakhi Ray, Virginia (USA)
Fayola Peters, Lero (Ireland)
Guangzhi Qu, Oakland University (USA)
Charles Sutton, University of Edinburgh (UK)
Thein Than Tun, Open University (UK)
Bogdan Vasilescu, Carnegie Mellon University (USA)
Martin Vechev, ETH Zurich (Switzerland)
Xiaoyin Wang, University of Texas at San Antonio (USA)
Alistair Willis, Open University (UK)
Yijun Yu, Open University (UK)