Completed Projects


Like most XML applications, METS, the Metadata Encoding and Transmission Standard, overloads a small number of generic syntactic relationships (e.g., parent/child) to represent a variety of specific semantic relationships. Human beings correctly infer the meaning of METS markup, and these understandings inform the logic and design of applications that import, export, and transform METS-encoded resources and descriptions. However, METS's flexibility and generality invite diverse interpretations, posing challenges for processing across different METS profiles and local adaptations. Robust processing requires support in the form of a general software library for reasoning about METS documents. We describe the current state of development for such a library. This METS interpretation...


The use of library resources is stereotyped as a solitary activity, with hardly any mention in the library science and information retrieval literature of the social aspects of information systems. However, it is clear that end-users engage in significant collaboration with co-searchers, library staff, and other interested parties. The skill of locating information is one that a growing number of people require, but our knowledge of how to teach it remains rudimentary. In particular, database systems fail to support both the learning of skills and the sharing of information.


Markup licenses inferences about a text. But the information warranting such inferences may not be entirely explicit in the syntax of the markup language used to encode the text. This paper describes a Prolog environment for exploring alternative approaches to representing facts and rules of inference about structured documents. It builds on earlier work proposing an account of how markup licenses inferences, and of what is needed in a specification of the meaning of a markup language. Our system permits an analyst to specify facts and rules of inference about domain entities and properties as well as facts about the markup syntax, and to construct and test alternative approaches to translation between representation layers. The system provides a level of abstraction at which the...


This project, a subaward from the University of Michigan, compares broadband development processes and outcomes across three leading domestic broadband initiatives: the federal government’s Broadband Technology Opportunities Program (BTOP) and Broadband Initiatives Program (BIP), and the experimental Google Fiber initiative. Working closely with the sponsoring organizations and a subset of project grantees, our study deploys a combination of ethnographic, survey-based, network analytic, and statistical methods to address three central questions:

  • What role do existing community resources and networks play in efforts to mobilize, secure funding for, and deploy high-speed broadband infrastructure?
  • What new or extended forms of social interaction are supported by...


Ford Foundation — $108,000

Against the backdrop of a powerful desire for national modernization, the Long 1960s (c. 1955-1975) witnessed attempts to build, literally, a better post-war Britain. The unprecedented burst of building activity that marked the post-war years included the planning and construction of hundreds of public library buildings, clothed in a variety of modernist styles symbolic of the period's spirit of renewal. Described at the time as a "national health service for reading," public libraries assumed a prominent position in the post-war welfare state. Through analysis of extant buildings and primary source documents, the research will examine what modernist library design meant to librarians, architects, local politicians and planners, and the public. The research will contribute to recent...


University of Illinois Research Board — $16,250

In this Early Career Development project, Williams used a social capital/social network model to research actual and potential IT use in six disadvantaged communities across Chicago. The research analyzed how people and communities are already using computers and the Internet, and how their own lives and identities might be represented as part of our nation's cyberinfrastructure.

Funding for this project was provided by the Institute of Museum and Library Services Laura Bush 21st Century Librarian Program.


Institute of Museum and Library Services — $199,796

This project allies with IMLS’s support of the Campaign for Grade-Level Reading with an exploration of the use of tablet computers, apps, and e-books in public libraries as a tool against summer reading loss. It will engage with experts in scholarship and practice to define the public library’s role in selecting and providing existing digital media for younger children, especially those primary-grades children in low-income communities who are most in need of intervention, whose access to media at home is limited, and for whom summer often means a loss of skills. This plan draws on both the historic involvement of public libraries in literacy through summer reading programs and ongoing support, and their long-term role as providers and facilitators for communities impacted by disparate...


Institute of Museum and Library Services — $46,678

“Scientific collections created and used in basic research are an integral part of the nation’s scientific infrastructure. They hold specimens of plants, animals, microbes, fossils, minerals and other artifacts that together comprise a national legacy of biological diversity” (NSF Scientific Collections Survey, 2009). Individual specimens in these collections serve as the anchor for an expanding array of information, about the specimen and the group that the specimen represents, that grows and changes with time. Unfortunately, specimens and subsamples are scattered geographically across institutions. Taxonomic, genomic, geospatial, and other information about the specimens is also scattered across independent computer systems and on paper and is very difficult to access or synthesize...


National Science Foundation — $35,581

This project develops a freely available database that links Medline papers and U.S. patents, through identification of individuals who authored both papers and patents and analysis of citations between papers and patents. These patent-paper-author links will then enable identification of similar organizations and in some cases, science/technology field and geography. Co-authorship networks for scientists are also prepared, annotated and made available on the Dataverse Network System (DVN), analogously to what has been done for inventors in the patent record (NSF proposal 0830287). This integrated database enables researchers to investigate how 1) grants enable papers, 2) papers influence patenting, and 3) scientific knowledge ultimately diffuses and influences the entire patent record...


National Science Foundation — $445,165
Andrew W. Mellon Foundation — $10,000

The primary goal of the Data Curation Education Program (DCEP) is to design a program of graduate study that can serve as a model for training data curators (DCs) within the context of a larger LIS education. Secondarily, we intend to integrate this graduate training with ongoing research and practice to produce specialists who understand the research culture and can make substantive contributions to the mission of scientific, humanities, social science, and cultural heritage institutions and libraries.



Institute of Museum and Library Services — $852,502

Specific objectives are: 1. Develop and refine a humanities data curation curriculum. 2. Develop a network of internship sites at libraries, museums, digital archives and digital humanities centers. 3. Promote the role of LIS professionals in humanities data curation and expand the understanding of the role of digital data curation in humanities research. 4. Disseminate best practice reports and provide a model curriculum for other data curation programs. 5. Develop a model institute for delivering the curriculum as continuing professional development. 6. Deliver the curriculum to both new masters students and continuing professionals.



Institute of Museum and Library Services — $892,028

Funded by IFLA (The International Federation of Library Associations and Institutions), this research explores the variety of educational models for developing digital librarians around the world. In collaboration with members of the IFLA Education and Training Section, the results will be used to determine the feasibility of establishing international guidelines for educating digital librarians.



International Federation of Library Associations and Institutions

The Center for Informatics Research in Science and Scholarship at the University of Illinois will collaborate with the Maryland Institute for Technology in the Humanities at the University of Maryland and the Center for Digital Scholarship at Brown University to develop and conduct a series of advanced institutes on data curation for the digital humanities, to be held at the University of Illinois, Urbana-Champaign (Graduate School of Library and Information Science), the University of Maryland (Maryland Institute for Technology in the Humanities) and at Brown University (Center for Digital Scholarship).


National Endowment for the Humanities — $144,855

This project proposes to use supervised machine learning to build an entity extractor that is specifically designed to support the construction of socio-technical network data. The resulting probabilistic prediction models and end-user technology are essential for being able to address substantive questions about real-world networks. The project team will make these outcomes publicly available to enable others to perform text coding projects, especially in the social sciences and humanities. We will also apply this extractor to multiple corpora for research projects.


Extreme Science and Engineering Discovery Environment

The goal of this multiple-phased research project is the development of a next-generation catalog prototype implementation with enhanced records for access to the folktale collection in the Center for Children's Books that gives special consideration to the shared and unique information seeking tasks of three distinct user groups: scholars, practitioners and laypeople. Bibliographic records for folktale resources frequently omit indicators of the rich, cultural heritage these items represent and provide only minimal access to their intellectual contents. Record enhancements may incorporate existing folktale classifications such as the Aarne-Thompson tale-type index and controlled vocabularies as well as current developments in cataloguing practices and standards such as FRBR (...


OCLC/ALISE Library and Information Science Research Grant Program — $15,000

Mak's book, "How the Page Matters," historicizes recent debates about eBooks and similar technologies by casting the page as an interface that has been under development since the scrolls of Antiquity. "How the Page Matters" tracks the page through the manuscripts of the Middle Ages, the printed books of the early modern period, and onto digital displays. By locating the page in a broader tradition of writing technologies, the book re-examines the print and digital 'revolutions' and shows that the questions raised by digital theorists about the visualization of information are not new, but are instead the persistent issues in a long history of graphic reproduction. "How the Page Matters" contends that the material of the page is constitutive of knowledge; the...


University of Illinois Research Board — $9,000

The creation of a searchable electronic database of the titles designated "first purchase" for children's library collections from 1909 to the present.


Inclusive Gigabit Libraries: Learn, Discuss and Brainstorm consists of an educational campaign to raise awareness of next generation networks and how libraries might participate in U.S. Ignite-related initiatives; at least six national forums for about 150 library leaders; development of at least five case studies; and a white paper that will synthesize the forums and case studies.

The primary goal of the US Ignite Partnership will be to catalyze approximately 60 advanced, next-generation applications over the next five years in six areas of national priority: education and workforce development, advanced manufacturing, health, transportation, public safety, and clean energy.


Institute of Museum and Library Services — $99,168

This research project developed a strategy for implementing knowledge management systems to support collaborative work in disparately located high-performing teams in a large complex organization. This study provided the Air Force Center for Engineering and Environment (AFCEE) with guidance to implement the ANSR knowledge management system using SharePoint. AFCEE is responsible for protecting the environment and provides an industry-leading environmental sustainability program for the Air Force. With this in mind, the study examined the various governance, organizational, technological, and user acceptance factors and provides the following recommendations to turn ANSR into a mature knowledge management resource. The project also included the participation of four GSLIS graduate...


Language is an information system. We're building computer-based models of language evolution so that societies of autonomous agents can develop their own languages, and as a foundation for theories of dynamic information systems of all kinds (e.g. mutually co-adapting distributed subject indexes, website structure evolution).


This project partners NILRC along with ten libraries in community colleges in Illinois and Missouri to build a diverse professional workforce that understands community-based library staffing and service strategies as well as the challenges of serving a non-traditional, diverse, commuter-based student population. GSLIS will work with the University of Illinois College of Education to provide a varied curriculum. The partner libraries offer the students mentoring throughout the graduate program and for six months following graduation.



Institute of Museum and Library Services — $354,896

For all the developments in XML since 1998, one thing that has not changed is the understanding of XML documents as serializations of tree structures conforming to the constraints expressed in the document's schema. Notwithstanding XML's many strengths, there are problem areas which invite further research on some of the fundamental assumptions of XML and the document models associated with it. It is a challenge to represent in XML anything that does not easily lend itself to representation by context-free or constituent structure grammars, such as overlapping or fragmented elements, and multiple co-existing complete or partial alternative structures or orderings. For the purpose of our work, we call such structures complex structures, and we call documents containing such structures...

Gerard Salton is often credited with developing the vector space model (VSM) for information retrieval (IR). Citations to Salton give the impression that the VSM must have been articulated as an IR model sometime between 1970 and 1975. However, the VSM as it is understood today evolved over a longer time period than is usually acknowledged, and an articulation of the model and its assumptions did not appear in print until several years after those assumptions had been criticized and alternative models proposed. An often-cited overview paper titled "A Vector Space Model for Information Retrieval" (alleged to have been published in 1975) does not exist, and citations to it represent a confusion of two 1975 articles, neither of which was an overview of the VSM as a model of information...


The “Music Information Retrieval Evaluation eXchange” (MIREX) is the annual cycle of events wherein music information retrieval (MIR) researchers come together to investigate how well their innovative MIR algorithms perform. MIREX has played a pivotal role in the growth and success of the MIR research community, evaluating over 1,068 algorithms across 23 unique MIR task categories. Notwithstanding its growth and success, the research landscape in which MIREX resides has evolved considerably over its lifetime. With developments in the field, MIREX is placed in a novel research context allowing it to undertake a process of reflection and regeneration in order to maximize its impact by building upon its strengths, ameliorating its shortcomings, and capitalizing on its new opportunities....


Andrew W. Mellon Foundation — $399,939

Mix IT Up! Youth Advocacy Librarianship focuses on creating intentionally structured, youth-centered, engaged learning opportunities related to information technologies. Mix IT Up! enhances youth services by developing a library and information science (LIS) specialization that dovetails with community informatics and youth service in order to focus on systematically training librarians as youth advocates. Mix IT Up! actively recruits youth advocacy fellows from traditionally underrepresented groups in LIS - American Indian, Latino/a, African American, and working-class - by providing academic and financial support. Fellows are placed in long-term youth advocacy projects that partner with community organizations serving youth. Fellows also engage with selected community affiliates...


Institute of Museum and Library Services — $904,314

Multi-Agent Systems research at the ISRL covers basic studies of multi-agent systems, including coordination models, computational organization theory, and multi-agent infrastructure. The Multi-Agent Systems Group is home of the MACE3J experimental platform.



The Networked Environment for Music Analysis (NEMA) project is a multinational, multidisciplinary cyberinfrastructure project for music information processing that builds upon and extends the music information retrieval research being conducted by the International Music Information Retrieval Systems Evaluation Laboratory (IMIRSEL) at the University of Illinois at Urbana-Champaign (UIUC). NEMA brings together the collective projects and the associated tools of six world leaders in the domains of music information retrieval (MIR), computational musicology (CM) and e-humanities research. NEMA provides an open and extensible webservice-based resource framework that facilitates the integration of music data and analytic/evaluative tools that can be used by the global MIR and CM research...


Andrew W. Mellon Foundation — $1,200,000

This project assesses an American Library Association (ALA) “News Know-how” program, which engages librarians, journalists, news ethicists and students across the country in news literacy education.

The evaluation will provide information that will help the ALA and its partners adjust the strategy for delivering this program as well as provide a final evaluation of the overall impact of the program. The evaluation addresses the following questions:


American Library Association — $89,697



The overarching goals of the Open Annotation Collaboration (OAC) are to facilitate the emergence of a Web and resource-centric interoperable annotation environment that allows leveraging annotations across the boundaries of annotation clients, annotation servers, and content collections, to demonstrate the utility of this environment, and to see widespread adoption of this environment. To this end, the OAC has made available the draft annotation data model and ontology developed during Phase I. OAC Phase II focuses on directly engaging humanities scholars and involving existing collections of digital content that have well-defined communities of scholars interested in annotating such content.



Andrew W. Mellon Foundation — $362,000
Andrew W. Mellon Foundation — $170,000

The overarching goals of the Open Annotation Collaboration (OAC) are to facilitate the emergence of a Web and resource-centric interoperable annotation environment that allows leveraging annotations across the boundaries of annotation clients, annotation servers, and content collections, to demonstrate the utility of this environment, and to see widespread adoption of this environment. To this end, the OAC has made available the draft annotation data model and ontology developed during Phase I. OAC Phase II focuses on directly engaging humanities scholars and involving existing collections of digital content that have well-defined communities of scholars interested in annotating such content.



Andrew W. Mellon Foundation — $673,944

Outside of conventional classes, outside of schools and universities, how do people learn things? Often they ask a colleague to help show them what to do. It sounds obvious, but all our work on interface design, help systems, manuals and even training seems to ignore it. What would systems be like if they actively tried to support this process? That is what this research tries to address.


National Science Foundation — $442,184

This project will enhance the GSLIS doctoral program by building a stronger research community within the school for the study of information in society, including policy, economic, and historical dimensions. Project goals include enhancing the doctoral program curriculum; connecting the research community to the wider world of librarianship; and attracting and supporting thirteen diverse students, especially those from underrepresented groups, with a specific focus on recruiting doctoral students who will teach master's students capable of becoming future leaders in public, academic, and school libraries.


Institute of Museum and Library Services — $990,234

Interactive media are highly complex and at high risk for loss as technologies rapidly become obsolete. The Preserving Virtual Worlds project will explore methods for preserving digital games and interactive fiction. Major activities will include developing basic standards for metadata and content representation and conducting a series of archiving case studies for early video games, electronic literature and Second Life, an interactive multiplayer game. Second Life content participants include Life to the Second Power, Democracy Island and the International Spaceflight Museum. Partners include the University of Maryland, Stanford University, Rochester Institute of Technology and Linden Lab.



The original Preserving Virtual Worlds project, funded by the Library of Congress’s National Digital Information Infrastructure and Preservation Program (NDIIPP), investigated what preservation issues arose with computer games and interactive fiction, and how existing metadata and packaging standards might be employed for the long-term preservation of these materials. PVW2 will focus on determining properties for a variety of educational games and game franchises in order to provide a set of best practices for preserving the materials through virtualization technologies and migration, as well as provide an analysis of how the preservation process is documented.



Institute of Museum and Library Services — $785,898

Project Bamboo aims to create research environments for humanities scholars. The University of Illinois at Urbana-Champaign will support the Bamboo Phase I Technology Project by:
1. Implementing Proxied Software Environment for the Advancement of Scholarly Research Services (SEASR) Analytics as a Scholarly Service on the Bamboo Services Platform. This will entail modifying existing SEASR tools and services to interoperate with the Bamboo Services Platform. This work will help inform the development of the platform.


Andrew W. Mellon Foundation — $93,150

Temporal and social dynamics of the quality and reliability of information and information systems. How do information systems improve, fail, and fit in their social contexts? How does information quality evolve in large information bases?


Description of structural and semantic relationships and properties of, within, and between resources is seen as a key issue in digital preservation. But the markup languages used to encode descriptions for migration between and storage within digital repositories are subject to the same interpretive problems that complicate other uses of markup. This paper reports on a project that aims to address these problems by explicating facts that otherwise would not support automated inferencing. These facts are expressed as RDF (Resource Description Framework) triples, stored in and retrieved from a scalable RDF-based repository.



Eight scholarships will be offered over three years to outstanding and diverse students admitted to the Certificate of Advanced Studies (CAS) program. This program will provide continuing education by offering outstanding library practitioners the opportunity to continue their education in a topic related to youth services, and by providing institutional support for these students to develop continuing education workshops for others. The two central goals of this project are: 1) leaders in the youth services library profession will provide quality continuing education for their practitioner peers in school and public libraries, and 2) leaders in the youth services library profession will meaningfully contribute to best practices and research in this field.



Institute of Museum and Library Services — $364,925

Films are produced, screened and perceived as part of a larger and continuously changing ecosystem that involves multiple stakeholders and themes. This project will measure the impact of social justice documentaries by capturing, modeling and analyzing the map of these stakeholders and themes in a systematic, scalable and analytically rigorous fashion. This solution will result in a validated, re-useable and end-user friendly methodology and technology that practitioners can use to assess the long-term impact of media productions beyond the number of people who have seen a screening or visited a webpage. Moreover, bringing the proposed computational methodology into a real-world application context can serve as a case-study for demonstrating the usability of this cutting-edge solution...


Ford Foundation — $150,000

SEASR, subawarded through Stanford University, fosters collaboration by empowering scholars to share data and research in virtual work environments. This eases scholars’ access to digital research materials, which currently are stored in a variety of incompatible formats.


Andrew W. Mellon Foundation — $359,860

The Sowing Seeds project will establish a new community technology center (CTC) in Danville, Illinois, and expand basic training to this and four existing CTCs in Champaign-Urbana and East St. Louis, Illinois. Basic skills are just the gateway, however. The grant will allow for expansion of our advanced digital media training focused on the development of skills necessary to meet the NETS standards of the International Society for Technology in Education (ISTE).


Illinois Department of Commerce & Economic Opportunity — $116,457

This proposal seeks add-on funding for a two-year Ford Foundation study begun in fall 2010: SIBR, or Statewide Illinois Broadband Research. The research uses the theory and methods of community informatics to ask: How will high-speed internet, specifically the federal broadband projects funded by the 2009 Recovery Act, impact society? We aim to find out: Is this public policy working? How? At the highest level of abstraction, the study asks: How is society transforming into an information society? What are the successes, and what problems remain?


University of Illinois Research Board — $8,794

The goal of this research study is to examine the social and economic impact of the Urbana-Champaign Big Broadband (“UC2B”) project. The funding will support the analysis and reporting of the social and economic impact of the adoption of broadband services provided by the UC2B program and the development of a data archive to organize all of the data used to perform the analysis and reporting.


Partnership for a Connected Illinois — $57,009

Structural analysis of music (formal analysis) is one of the most fundamental analyses performed by music researchers, usually preceding any other type of analysis because it provides an overall view of the piece. Its importance is reflected by the fact that a course on formal analysis is often part of the core undergraduate music curriculum, with several major textbooks on the subject. The main goal of formal analysis is to find similar sections within a piece of music and to label these sections, for example as ABA or ABCB'A; with further analysis, these sections can be marked with predefined labels such as Intro, Verse, Verse, Bridge, Chorus, Verse, and Outro (popular music) or Introduction, Exposition, Development, Recapitulation, and Coda (sonata form). Thus, the formal analysis...


National Science Foundation — $99,476

How might we use advanced networked technologies in museums? How might they be used to improve the experience for visitors to the museum? We are investigating these questions through a careful analysis of what currently happens in museums and how we might want to build on or change that. We think that much can be learned from studying the kinds of things that docents do when they give a guided tour. We are not proposing that museums should replace docents, but rather that the work of docents should serve as a starting point for considering the kinds of functionality that computer systems should provide. This approach tries to avoid the technology-driven obsession of much innovative use of advanced technologies and instead to explore the design space of possibilities with a stronger focus on...


Vast quantities of electronic information provide a unique opportunity for scientists to identify candidate solutions for grand challenges, as scientists, policy makers, and students have never had access to more electronic information than they do today. The goal of this research is to develop new text mining methods that are consistent with the manual processes that experts currently use to resolve contradictory and redundant evidence. Both discovery and synthesis are difficult activities even for people, so a socio-technical strategy will be required to achieve this goal.



National Science Foundation — $449,317

Jon Gant directs a canvassing operation and research team to assess and evaluate the implementation, progress and overall adoption success of the UC2B project. Researchers hope to deepen the understanding of the barriers to broadband adoption among residents of underserved communities. Gant's team has been integral to identifying households that are eligible for free installation, and then working with city agencies and local businesses to connect the households to fiber-optic broadband service. The Broadband Operations project has experience across the digital spectrum, from those completely unfamiliar with digital applications, to others who utilize the Internet for education, entertainment, employment, or citizen activism, among a range of choices. The adoption team is on the...


National Telecommunications and Information Administration — $450,000

Web-based Information Science Education (WISE) is a unique and groundbreaking opportunity in online education. Leading library and information science schools have extended their reach on a global basis to broaden the educational opportunities available to students. WISE uses advanced online technology to enrich education and foster relationships among students, faculty, and universities. The vision of this initiative is to provide a collaborative distance education model that will increase the quality, access, and diversity of online education opportunities in library and information science.



Institute of Museum and Library Services — $257,427

Researchers rely on collections of books and other materials to support their scholarship. From these collections, scholars select, organize, and refine the worksets that will answer to their particular research objectives. The requirements for those worksets are becoming increasingly sophisticated and complex, both as humanities scholarship has become more interdisciplinary and as it has become more digital.

The HathiTrust Research Center (HTRC) is developing computational research access to some 10 million volumes (3 billion pages) in the HathiTrust corpus, a digital library of millions of books and other materials digitized by the Google Books project and other mass-digitization efforts. The HTRC is a collaborative research center launched jointly by Indiana University and...


Andrew W. Mellon Foundation — $436,525