Scaling up Tools for Analyzing Truly Massive Social Networks
Abstract
As social networks continue to increase in popularity, the amount of data available to researchers and practitioners is growing exponentially. This data is incredibly valuable to computational sociologists, criminologists, and security specialists, who need to compute small but interesting features in order to analyze these networks and model their genesis. Unfortunately, much of the available software for analyzing these networks is only able to run on small-to-large networks, whereas true social networks are “massive” and cannot even fit into the memory of a single machine. Even when software can handle such large networks, it is typically slow.
Together, we will design and build state-of-the-art software to analyze truly massive social networks. We will first investigate standard techniques such as decomposing the data into manageable pieces, storing most of the data on disk and loading only what is necessary into memory, and parallelism. We will also investigate lesser-known techniques such as graph reductions. Aside from the new techniques you will learn from this project, an added benefit is that we will release the software we write together as an open-source package that you can include in your portfolio and show to potential employers.
For further information, here are some keywords that best describe the project’s content: graph partitioning, k-core, k-connected components, external memory algorithms, combinatorial optimization, combinatorial scientific computing, subgraph isomorphism, motif search, subgraph enumeration, minimum vertex cover, and maximum clique.
Student Qualifications
Intermediate to advanced proficiency in the C/C++ or Java programming languages.
Must have taken and passed COSC 102; additional experience with algorithms, or having taken COSC 302, is a plus.
Number of Student Researchers
2 students
Project Length
8 weeks
Applications open on 01/15/2017 and close on 02/07/2017