Current research projects are listed alphabetically by project title.
Against the backdrop of a powerful desire for national modernization, the Long 1960s (c. 1955-1975) witnessed attempts to build, literally, a better post-war Britain. The unprecedented burst of building activity that marked the post-war years included the planning and construction of hundreds of public library buildings, clothed in a variety of modernist styles symbolic of the period's spirit of renewal. Described at the time as a "national health service for reading," public libraries assumed a prominent position in the post-war welfare state. Through analysis of extant buildings and primary source documents, the research will examine what modernist library design meant to librarians, architects, local politicians and planners, and the public. The research will contribute to recent...
Funding: University of Illinois Research Board — $16,250
This project allies with IMLS’s support of the Campaign for Grade-Level Reading with an exploration of the use of tablet computers, apps, and e-books in public libraries as a tool against summer reading loss. It will engage with experts in scholarship and practice to define the public library’s role in selecting and providing existing digital media for younger children, especially those primary-grades children in low-income communities who are most in need of intervention, whose access to media at home is limited, and for whom summer often means a loss of skills. This plan draws on both the historic involvement of public libraries in literacy through summer reading programs and ongoing support, and their long-term role as providers and facilitators for communities impacted by disparate...
Funding: Institute of Museum and Library Services — $46,678
DataONE (Data Observation Network for Earth) is now poised to resolve many of the key challenges that hinder the realization of more global, open, and reproducible science, through four interrelated cyberinfrastructure (CI) activities: (1) significantly expanding the volume and diversity of data available to researchers for large-scale scientific innovation and discovery; (2) incorporating innovative and high-value science-enabling features into the DataONE CI; (3) maintaining and improving core software and services; and (4) increasing the integration of loosely coupled nodes in the system while maintaining cybersecurity and trust. This project will be completed in conjunction with the University of New Mexico.
Funding: National Science Foundation — $204,132
The goal of Data Curation Education in Research Centers (DCERC) is to develop a sustainable and transferable model for educating Library and Information Science (LIS) master's and doctoral students in data curation through field experiences in research and data centers. DCERC will establish and implement a graduate research and education program in scientific data curation that will bring students into the real world of scientific data curation, where they will engage with current practices and challenges, and share their developing expertise and research in the area.
Funding: Institute of Museum and Library Services — $988,543
The Center for Informatics Research in Science and Scholarship at the University of Illinois will collaborate with the Maryland Institute for Technology in the Humanities at the University of Maryland and the Center for Digital Scholarship at Brown University to develop and conduct a series of advanced institutes on data curation for the digital humanities, to be held at the University of Illinois, Urbana-Champaign (Graduate School of Library and Information Science), the University of Maryland (Maryland Institute for Technology in the Humanities) and at Brown University (Center for Digital Scholarship).
Funding: National Endowment for the Humanities — $144,855
The United States is a world leader in technological innovation. However, as our technology has advanced, the need for cyber security experts has increased dramatically. Unfortunately, the U.S. lacks the cyber security workforce needed to manage many of the threats our society faces.
One method used to attract talented individuals to careers in cyber security has been the organization of cyber security competitions. Such contests aim to train the next generation of cyber security specialists using hands-on competition. Examining the overall effectiveness of cyber security competitions and expanding our understanding of the individuals who participate are keys to future success in cyber security recruitment.
Funding: National Science Foundation — $161,708
This project proposes to use supervised machine learning to build an entity extractor specifically designed to support the construction of socio-technical network data. The resulting probabilistic prediction models and end-user technology are essential for addressing substantive questions about real-world networks. The project team will make these outcomes publicly available to enable others to perform text coding projects, especially in the social sciences and humanities. We will also apply this extractor to multiple corpora for research projects.
Funding: Extreme Science and Engineering Discovery Environment
The HathiTrust Research Center (HTRC) is partnering with the Cultural Observatory team that developed the Google Books Ngram Viewer together with Google. The goal of this collaboration is to implement a greatly enhanced open-source version of the Cultural Observatory’s “Bookworm” text analysis and visualization tool, designed to help scholars meet the challenges posed by the massive scale of the HathiTrust (HT) corpus. We are calling our multi-disciplinary, multi-institutional collaboration the HathiTrust + Bookworm (HT+BW) Project. Participating institutions include the University of Illinois, Indiana University, Northeastern University, Baylor College of Medicine, and Rice University.
Funding: National Endowment for the Humanities — $504,373
The Illinois Digital Innovation Leadership Program will increase opportunities for entrepreneurship, economic development, and innovation through the expansion of digital manufacturing, digital media production, and data analytics. Supported by the University of Illinois Extension, the project will engage Illinoisans with mobile digital design and innovation labs, or “DigiTech Hubs,” which will serve as high-tech inventor workshops equipped with tools for everything from audio production to 3D printing. Digital Innovation Leadership staff will work with 4-H clubs, public libraries, and public schools to develop permanent community-based and -supported studios, creating a network that will build statewide capacity in digital design, manufacturing, and entrepreneurship.
Funding: University of Illinois Extension — $300,000
Notably absent in the current rush to digitize newspapers and books are critical investigations of the processes and products of this work. Such examinations are forestalled, Mak argues, by a rhetoric of revolution that determines how the phenomenon should be constituted and studied, just as it continues to do for the so-called printing revolution of the fifteenth century. Her analysis of digitizations exposes the ways in which historical sources are being reconfigured for digital transmission as part of the “information revolution,” and considers the consequences of this reconfiguration for humanities scholarship, cultural heritage, and the making of meaning.
Funding: Illinois Program for Research in the Humanities — $14,000
For older texts to be searchable, contemporary English needs to be translated into the language of various historical periods. The project will develop software that lets people enter a query in contemporary English and search over English texts throughout history, from medieval times to the present day. The project will mostly involve training statistical models that assign a probability to each candidate translation of a word or phrase into the target variety of English. The project will also look at how to display results in order to provide the user with the most probable answer to the query.
Funding: Google — $49,429
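As a rough illustration of the historical-English search project above, the sketch below expands each word of a contemporary query into its most probable historical variants. It is a minimal sketch under stated assumptions, not the project's software: the table `TRANSLATION_TABLE`, its entries and probabilities, and the function `expand_query` are hypothetical and hand-written here, whereas the project describes learning such probabilities with trained statistical models.

```python
from typing import Dict, List, Tuple

# Hypothetical translation table: P(historical form | contemporary word).
# Entries and probabilities are illustrative only, not learned from data.
TRANSLATION_TABLE: Dict[str, Dict[str, float]] = {
    "you": {"thou": 0.5, "ye": 0.3, "you": 0.2},
    "has": {"hath": 0.7, "has": 0.3},
}


def expand_query(query: str, top_k: int = 2) -> List[List[Tuple[str, float]]]:
    """Return, for each query word, its top_k most probable variants."""
    expanded = []
    for word in query.lower().split():
        # Assumption: a word missing from the table translates to itself.
        candidates = TRANSLATION_TABLE.get(word, {word: 1.0})
        # Rank candidate translations from most to least probable.
        ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        expanded.append(ranked[:top_k])
    return expanded


if __name__ == "__main__":
    for variants in expand_query("you has gold"):
        print(variants)
```

A search front end could then issue the ranked variants as alternative query terms and order the results by the same probabilities.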
Microblogging services like Twitter are becoming an important part of how many people manage information in their day-to-day activities. As microblog traffic increases (Twitter currently sees about 50 million tweets per day), information management and organization will become pressing problems in this area. The project will define the core problems in microblog search and propose solutions to these challenges in the form of both theoretical models and prototype search systems.
Funding: Google — $45,563
Google — $22,714
The “Music Information Retrieval Evaluation eXchange” (MIREX) is the annual cycle of events wherein music information retrieval (MIR) researchers come together to investigate how well their innovative MIR algorithms perform. MIREX has played a pivotal role in the growth and success of the MIR research community, evaluating over 1,068 algorithms across 23 unique MIR task categories. Notwithstanding its growth and success, the research landscape in which MIREX resides has evolved considerably over its lifetime. With developments in the field, MIREX is placed in a novel research context allowing it to undertake a process of reflection and regeneration in order to maximize its impact by building upon its strengths, ameliorating its shortcomings, and capitalizing on its new opportunities....
Funding: Andrew W. Mellon Foundation — $390,088
This project aims to improve search engine effectiveness by using knowledge base (KB) entries to inform query expansion. While the intersection of KBs and information retrieval (IR) is a growing research area, this project proposes a novel approach to KB-based query modeling. In particular, this project proposes to let the structure that KB authors impose within individual KB entries guide the final query model. For instance, authors of Wikipedia pages divide individual entries into sections, subsections, bulleted lists, etc. The goal of this project is to use such intra-entity structure to derive highly focused query models. This project is a collaboration between Dr. Efron, his Google sponsor, and a Ph.D. student with the goal of advancing the state of the art in using structured...
Funding: Google — $22,130
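To make the idea of structure-guided query expansion in the knowledge-base project above more concrete, the sketch below picks the section of a KB entry that best matches the query and draws expansion terms only from that section. This is a simplified illustration and not the project's actual query model: the entry format (a dict of section title to text), the overlap score, the stopword list, and the function names are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def expand_with_best_section(
    query: str, entry: Dict[str, str], n_terms: int = 3
) -> List[str]:
    """Expand a query using terms from the best-matching section of one entry."""
    q_terms = tokenize(query)
    q_set = set(q_terms)

    def overlap(title: str) -> int:
        # Count how many section tokens are query terms.
        return sum(1 for t in tokenize(entry[title]) if t in q_set)

    # Let the entry's own section structure decide where expansion terms come from.
    best = max(entry, key=overlap)
    counts = Counter(t for t in tokenize(entry[best]) if t not in q_set)
    expansion = [term for term, _ in counts.most_common(n_terms)]
    return q_terms + expansion


if __name__ == "__main__":
    entry = {
        "History": "jaguar cars founded in coventry england",
        "Biology": "jaguar cat habitat rainforest prey habitat prey habitat",
    }
    print(expand_with_best_section("jaguar habitat", entry, n_terms=1))
```

Restricting expansion to one section keeps the added terms focused on a single sense of the query, which is the intuition behind using intra-entry structure rather than the whole entry.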
The Site-Based Data Curation (SBDC) project is a two-year effort to develop a framework of policies and processes for the curation of “site-based” digital research data that responds to the needs of long-tail science researchers and site managers, and promotes coordination with libraries and data repositories. The SBDC framework will be developed by experts in data curation, research library repositories, domain science, and research site management, providing a curation model that includes: 1) policies to infuse principled curation practices early in the data lifecycle; and 2) processes for curating cohesive aggregations of usable digital data for transfer from research sites to libraries and repositories. The SBDC framework is an important step forward in evolving the professional...
Funding: Institute of Museum and Library Services — $499,919
Films are produced, screened and perceived as part of a larger and continuously changing ecosystem that involves multiple stakeholders and themes. This project will measure the impact of social justice documentaries by capturing, modeling and analyzing the map of these stakeholders and themes in a systematic, scalable and analytically rigorous fashion. This solution will result in a validated, reusable and end-user-friendly methodology and technology that practitioners can use to assess the long-term impact of media productions beyond the number of people who have seen a screening or visited a webpage. Moreover, bringing the proposed computational methodology into a real-world application context can serve as a case study for demonstrating the usability of this cutting-edge solution...
Funding: Ford Foundation — $150,000
This project will create both a master’s and doctoral-level specialization in Socio-technical Data Analytics (SODA). Partnerships with local researchers and businesses that already work with large datasets will enable MS graduates to receive first-hand experience with both the social and technical implications of large digital data collections, and thus be well prepared for leadership roles in academic and corporate environments. Similarly, doctoral students will consider multiple stages of the information lifecycle, which will help to ensure that their research findings will generalize to a range of scholarly and business practices.
Funding: Institute of Museum and Library Services — $498,777
Time affects information retrieval in many ways. Collections of documents change as new items are indexed. The content of documents themselves may change. Users submit queries at particular moments in time. And perhaps most importantly, people’s assessment of a document’s relevance to a query is often time-dependent. For example, searchers of news archives might seek information on a past event where relevant documents cluster in a window of time. Users of social media services such as Twitter demand topically relevant information that is new. People who monitor particular topics in the news (for example, editors of Wikipedia) take action when they find information that is topically relevant and that changes current knowledge. The traces of information created by change in documents,...
Funding: National Science Foundation — $408,908
The goal of this research is to help researchers develop and use relatively simple tools to describe species in a way that makes those descriptions easier to share with other scientists and easier for computers to process and analyze. The approach is bottom-up and iterative, involving the rapid prototyping of tools, combining of existing tools, and the tailoring of applications developed for one purpose but now being reused for this scientific activity. Innovation from this project is applicable to the long-term development of open source software initiatives serving labs throughout the world. The project provides rich, real-world training for graduate students in library and information sciences, training them to be much-needed cross-disciplinary researchers in a field desperate for...
Funding: National Science Foundation — $421,200
For over two millennia, librarians have played a critical role in the production and transmission of knowledge. They have helped to collect, catalogue, and curate a vast range of materials that constitute much of our cultural heritage - from epic poetry on papyrus scrolls to PDFs of scholarly articles. This project interrogates these practices by building a librarian's cabinet of curiosity, and populating it with explicit examples of the mundane activities that occur in and around the library.
Funding: University of Illinois Research Board — $6,500
This project examines the impact of different research funding structures on the training of future scientists, particularly graduate students and postdoctoral fellows, and the impact on their subsequent outcomes. Our proposed research begins by examining the way in which research (and most training) is funded and done. We classify projects by whether they are large or small in scale (by funding size), involve multiple researchers, or span multiple institutions. We construct different measures of project teams, and capture the subsequent trajectories of the students and postdoctoral fellows during and after their contact with the teams. We make use of a natural experiment and quasi-experimental statistical techniques to separate the effect of funding structures from the other factors contributing to...
Funding: National Science Foundation — $175,918
Researchers rely on collections of books and other materials to support their scholarship. From these collections, scholars select, organize, and refine the worksets that will answer to their particular research objectives. The requirements for those worksets are becoming increasingly sophisticated and complex, both as humanities scholarship has become more interdisciplinary and as it has become more digital.
The HathiTrust Research Center (HTRC) is developing computational research access to some 10 million volumes (3 billion pages) in the HathiTrust corpus, a digital library of millions of books and other materials digitized by the Google Books project and other mass-digitization efforts. The HTRC is a collaborative research center launched jointly by Indiana University and...