Current research projects are listed alphabetically by project title.
The focus of this three-year, multisite project is the development of app-based curricula and tools for use in school and public libraries. These tools will teach children aged eight to twelve how to build their own apps, providing them with early programming experience, and allow them to share their creations with other children. The project further establishes libraries as places to engage youth in STEM exploration and digital development that reflects their own experiences.
This project builds on work conducted with support from a planning-phase grant from the Institute of Museum and Library Services titled "Closing the App Gap."
Funding: Institute of Museum and Library Services — $248,205
This project expands Tilley’s investigation of comics from the perspective of readers, a much-neglected group in both contemporary and historical research. Comics readership among young people peaked in the mid-twentieth century, with levels reaching nearly 100%, yet there has been little scholarly investigation of this phenomenon. Funding for this project will enable archival research trips and hourly research support to complete the data collection necessary for a single-author monograph that will provide a coherent examination of the social and cultural role of comics in children’s print culture in the United States throughout the twentieth century.
Funding: University of Illinois Research Board — $19,036
Films are produced, screened and perceived as part of a larger and continuously changing ecosystem that involves multiple stakeholders and themes. This project will measure the impact of social justice documentaries by capturing, modeling and analyzing the map of these stakeholders and themes in a systematic, scalable and analytically rigorous fashion. This solution will result in a validated, reusable and end-user-friendly methodology and technology that practitioners can use to assess the long-term impact of media productions beyond the number of people who have seen a screening or visited a webpage. Moreover, bringing the proposed computational methodology into a real-world application context can serve as a case study for demonstrating the usability of this cutting-edge solution...
Funding: Ford Foundation — $150,000
Data Observation Network for Earth (DataONE) is a collaborative, global project that is laying the groundwork for a new, innovative approach to conducting environmental science research. DataONE is a distributed framework and sustainable infrastructure poised to resolve many of the key challenges that hinder the realization of more global, open, and reproducible science, through four interrelated cyberinfrastructure (CI) activities:
Funding: National Science Foundation — $204,132
The United States is a world leader in technological innovation. However, as our technology has advanced, the need for cyber security experts has increased dramatically. Unfortunately, the U.S. lacks the cyber security workforce needed to manage many of the threats our society faces.
One method used to attract talented individuals to careers in cyber security has been the organization of cyber security competitions. Such contests aim to train the next generation of cyber security specialists using hands-on competition. Examining the overall effectiveness of cyber security competitions and expanding our understanding of the individuals who participate are keys to future success in cyber security recruitment.
Funding: National Science Foundation — $161,708
This project, conducted collaboratively by GSLIS and the University Library, will further our understanding of four translational research questions:
Funding: Andrew W. Mellon Foundation — $247,675
The HathiTrust Research Center (HTRC) is partnering with the Cultural Observatory team that developed the Google Books Ngram Viewer together with Google. The goal of this collaboration is to implement a greatly enhanced version of the Cultural Observatory’s open-source “Bookworm” text analysis and visualization tool, designed to help scholars meet the challenges posed by the massive scale of the HT corpus. We are calling our multi-disciplinary, multi-institutional collaboration the HathiTrust + Bookworm (HT+BW) Project. Participating institutions include the University of Illinois, Indiana University, Northeastern University, Baylor College of Medicine, and Rice University.
Funding: National Endowment for the Humanities — $504,373
The HathiTrust has provided funding for the HathiTrust Research Center (HTRC), co-located at the University of Illinois and Indiana University, to serve as the research arm of the HathiTrust and create an agile, technology-rich service for researchers in the digital humanities, social sciences, natural sciences, and informatics. This service will help researchers conduct nonconsumptive research on the HathiTrust digital library database, a collection of just under 14 million digitized volumes, equating to 4.9 billion pages, 60% of which are under some copyright restriction. At the same time, center staff will develop and refine tools to aid in digital humanities and text mining research over large databases and will operate the secure, large-scale computation environment required by this...
Funding: HathiTrust — $1,000,000
The Illinois Digital Innovation Leadership Program will increase opportunities for entrepreneurship, economic development, and innovation through the expansion of digital manufacturing, digital media production, and data analytics. Supported by the University of Illinois Extension, the project will engage Illinoisans with mobile digital design and innovation labs, or “DigiTech Hubs,” which will serve as high-tech inventor workshops equipped with tools for everything from audio production to 3D printing. Digital Innovation Leadership staff will work with 4-H clubs, public libraries, and public schools to develop permanent community-based and -supported studios, creating a network that will build statewide capacity in digital...
Funding: University of Illinois Extension — $300,000
Notably absent in the current rush to digitize newspapers and books are critical investigations of the processes and products of this work. Such examinations are forestalled, Mak argues, by a rhetoric of revolution that determines how the phenomenon should be constituted and studied, just as it continues to do for the so-called printing revolution of the fifteenth century. Her analysis of digitizations exposes the ways in which historical sources are being reconfigured for digital transmission as part of the “information revolution,” and considers the consequences of this reconfiguration for humanities scholarship, cultural heritage, and the making of meaning.
Funding: Illinois Program for Research in the Humanities — $14,000
GSLIS faculty members Catherine Blake and Michael Twidale are working as expert advisors to the US Department of Veterans Affairs (VA) Information Resource Center (VIReC) on a project to analyze the socio-technical aspects of VA’s HSRData-L Listserv. VIReC is a VA Health Service Research & Development Service (HSR&D) resource center that supports VA researchers in need of information about data resources specific to their research. HSRData-L is a virtual community of VA researchers who share their collective knowledge and experience about VA data and information systems for the betterment of research focused on Veterans’ issues. The team is led at the VA by Maria Souden, VIReC associate director for communications. GSLIS doctoral student Caryn Anderson, who has worked...
Funding: U.S. Department of Veterans Affairs — $67,186
U.S. Department of Veterans Affairs — $95,767
For older texts to be searchable, contemporary English needs to be translated into the language of various historical periods. The project will develop software that lets people enter a query in contemporary English and search over English texts throughout history, from medieval times to the present day. The project will mostly involve training statistical models that assign probabilities to candidate translations of a word or phrase into a target historical variety of English. The project will also look at how to display results in order to provide the user with the most probable answer to the query.
Funding: Google — $49,429
Microblogging services like Twitter are becoming an important part of how many people manage information in their day-to-day activities. As microblog traffic increases (Twitter currently sees about 50 million tweets per day), information management and organization will become pressing problems in this area. The project will define the core problems in microblog search and propose solutions to these challenges in the form of both theoretical models and prototype search systems.
Funding: Google — $45,563
Google — $22,714
Diesner’s team is developing a natural-language processing solution for probabilistic entity detection and classification in the domain of healthcare. The core of the solution is a set of prediction models built using supervised and/or semi-supervised machine learning techniques. The resulting models can be used to annotate natural-language text documents for entity classes. The team will perform fact extraction from medical text documents as well as map tokens to predefined medical codes. Both tasks involve the same steps: 1) building and evaluating prediction models, 2) helping to integrate the prediction models into IMO’s workflow, 3) building an inference engine for practical applications, 4) building a technical solution with which IMO can update the prediction models, and 5...
Funding: Intelligent Medical Objects — $299,998
Assistant Professor Jana Diesner received a Faculty Fellowship and seed funding for her project, “Predictive Modeling for Impact Assessment,” from the National Center for Supercomputing Applications (NCSA). Diesner collaborates closely with NCSA scientists on the project, which builds on her work developing computational solutions to assess the impact of issue-focused information projects such as social justice documentaries and books. Her research team leverages big social data for this purpose and combines techniques from machine learning and natural language processing to identify a fine-grained set of impact factors from textual data sources such as news articles, reviews, and social media. This project aims to locate...
Funding: National Center for Supercomputing Applications — $24,323
The past decade has seen tremendous progress in the field of preservation, particularly with respect to preservation of digital materials. To date, however, there has been only minimal research activity within North America on the preservation of intangible cultural heritage—such as language, cuisine, performing arts, and traditional craftsmanship—and its relationship to the preservation of material expressions of culture. Given the importance of intangible heritage to the cultural and scholarly record, a more significant research program in this area would be of benefit to the scholarly community. In order to launch such a research program, the investigators believe it would be helpful to organize a meeting of individuals and organizations with a strong interest in the preservation of...
Funding: Andrew W. Mellon Foundation — $25,500
This project aims to improve search engine effectiveness by using knowledge base (KB) entries to inform query expansion. While the intersection of KBs and information retrieval (IR) is a growing research area, this project proposes a novel approach to KB-based query modeling. In particular, this project proposes to let the structure that KB authors impose within individual KB entries guide the final query model. For instance, authors of Wikipedia pages divide individual entries into sections, subsections, bulleted lists, etc. The goal of this project is to use such intra-entity structure to derive highly focused query models. This project is a collaboration between Miles Efron, his Google sponsor, and a PhD student with the goal of advancing the state of the art in using structured...
Funding: Google — $22,130
Taxonomists are scientists who describe the world’s biodiversity. These descriptions of millions of species allow scientists to do many different kinds of research, in fields including basic biology, environmental science, climate research, agriculture, and medicine. The problem is that describing any one species is not easy. The language used by taxonomists to describe their data is complex, and typically not easily understandable by computers or even by other scientists. This situation makes it harder to search for patterns across millions of species documented by thousands of researchers over many decades of work worldwide.
Funding: National Science Foundation — $292,306
Across the country, colleges and universities are struggling to meet demand for accessible forms of course materials for students with an array of disabilities. At present, each institution is addressing this problem individually, at great expense, and often without full campus coordination, much less consortial collaboration. Locating digital files is difficult and requires consulting numerous sources. The resulting accessibility enhancement/conversion work creates a large corpus of digital files in varying forms to manage on each campus. Over the course of one year, this planning project will bring together experts from disability/accessibility services with librarians, IT professionals, advocates, and legal counsel, to develop shared infrastructure within which universities can support their...
Funding: Institute of Museum and Library Services — $7,173
Music prints and manuscripts created over the past thousand years sit on the shelves of libraries and museums around the globe. As these organizations digitize their collections, images of these scores are increasingly accessible online. However, the musical content remains difficult to search.
Google Books and HathiTrust have already made it possible to search the content of text documents through Optical Character Recognition (OCR), which transforms digital images of texts into a symbolic representation that can be searched by computers. For digital images of musical scores, the analogous technology is Optical Music Recognition (OMR).
Funding: Social Sciences and Humanities Research Council of Canada — $15,000
This project will create specializations in Socio-technical Data Analytics (SODA) at both the master’s and doctoral levels. Partnerships with local researchers and businesses who already work with large data sets will enable master’s graduates to receive first-hand experience with both the social and technical implications of large digital data collections, and thus be well-prepared for leadership roles in academic and corporate environments. Similarly, doctoral students will consider multiple stages of the information lifecycle, which will help to ensure that their research findings will generalize to a range of scholarly and business practices.
Funding: Institute of Museum and Library Services — $498,777
Time affects information retrieval in many ways. Collections of documents change as new items are indexed. The content of documents themselves may change. Users submit queries at particular moments in time. And perhaps most importantly, people’s assessment of a document’s relevance to a query is often time-dependent. For example, searchers of news archives might seek information on a past event where relevant documents cluster in a window of time. Users of social media services such as Twitter demand topically relevant information that is new. People who monitor particular topics in the news (for example, editors of Wikipedia) take action when they find information that is topically relevant and that changes current knowledge. The traces of information created by change in documents,...
Funding: National Science Foundation — $408,908
This HathiTrust Research Center (HTRC) project seeks to produce the first large-scale cross-cultural study of the novel according to quantitative methods. Ever since its putative rise in the eighteenth century, the novel has emerged as a central means of expressing what it means to be modern. And yet despite this cultural significance, we still lack a comprehensive study of the novel’s place within society that accounts for the vast quantity of novels produced since the eighteenth century, the period most often identified as marking the origins of the novel’s quantitative rise. Our aim is thus twofold: 1) to enliven our understanding of one of the most culturally significant modern art forms according to new computational means, and 2) to establish the methodological foundations of a...
Funding: Social Sciences and Humanities Research Council of Canada — $48,000
The goal of this research is to help researchers develop and use relatively simple tools to describe species in a way that makes those descriptions easier to share with other scientists and easier for computers to process and analyze. The approach is bottom-up and iterative, involving the rapid prototyping of tools, combining of existing tools, and the tailoring of applications developed for one purpose but now being reused for this scientific activity. Innovation from this project is applicable to the long-term development of open source software initiatives serving labs throughout the world. The project provides rich, real-world training for graduate students in library and information sciences, training them to be much-needed cross-disciplinary researchers in a field desperate for...
Funding: National Science Foundation — $421,200
Despite the ubiquity of search in many people’s daily lives, a lack of search literacy can make it difficult to find solutions to technical problems, such as completing software-based tasks like troubleshooting program installations. GSLIS Professor Michael Twidale and Assistant Professor Max Wilson of the University of Nottingham have received funding from Google for a project that aims to develop an understanding of search literacy, and to recommend best practices for teaching technical search literacy and creating tools in support of this kind of search.
Funding: Google — $65,000
“Understanding the Needs of Scholars in a Contemporary Publishing Environment,” better known as Publishing Without Walls (PWW), is a digital scholarly publishing initiative that is scholar-driven, openly accessible, scalable, and sustainable. PWW will directly engage with scholars throughout the research process. It aims to build publishing models that can be supported locally by a university’s library, while also opening new avenues toward publication through university presses and other publishers. PWW is here to help scholars navigate the new opportunities presented by collaborative, multimodal, and interim-phase works.
Funding: Andrew W. Mellon Foundation — $1,000,000
For over two millennia, librarians have played a critical role in the production and transmission of knowledge. They have helped to collect, catalogue, and curate a vast range of materials that constitute much of our cultural heritage—from epic poetry on papyrus scrolls to PDFs of scholarly articles. This project interrogates these practices by building a librarian's cabinet of curiosity, and populating it with explicit examples of the mundane activities that occur in and around the library.
Funding: University of Illinois Research Board — $6,500
This project examines the impact of different research funding structures on the training of future scientists, particularly graduate students and postdoctoral fellows, and the impact on their subsequent outcomes. Our proposed research begins by examining the way in which research (and most training) is funded and done. We classify projects by whether they are large or small in scale (by funding size) and by whether they involve multiple researchers or multiple institutions. We construct different measures of project teams, and capture the subsequent trajectories of the students and postdoctoral fellows during and after their contact with the teams. We make use of a natural experiment and quasi-experimental statistical techniques to separate the effect of funding structures from the other factors contributing to...