Working Paper
Alstad Z, Dahlstrom-Hakki I, Asbell-Clarke J, Rowe E, Altman M. The Use of Multidimensional Biopsychological Markers to Detect Learning in Educational Gaming Environments. Working Paper.

This project explores how multidimensional bio-psychological measures are used to understand the cognitive aspects of student learning in STEM (Science, Technology, Engineering and Math) focused educational games. Furthermore, we seek to articulate a method for how learning events can be automatically analyzed using these tools. Given the complexity and difficulty of finding externalized markers of learning as it happens, it is evident that more robust measures could benefit this process. The work reported here, with funding from a National Science Foundation grant (NSF DRL-1417456), aims to incorporate more diverse measures of behavior and physiology in order to create a more complete assessment of learning and cognition in a game-based environment. Tools used in this project include eye tracking systems, heart rate sensors, and tools for detecting electrodermal activity (EDA), temperature, and movement data. Findings indicated both the utility of more varied measures and the need for more precise tools for synchronization of diverse data streams.

O'Brien D, Ullman J, Altman M, Gasser U, Bar-Sinai M, Nissim K, Vadhan S, Wojcik MJ, Wood A. When is Information Purely Public? Social Science Research Network [Internet]. Working Paper.
Researchers are increasingly obtaining data from social networking websites, publicly-placed sensors, government records and other public sources. Much of this information appears public, at least to first impressions, and it is capable of being used in research for a wide variety of purposes with seemingly minimal legal restrictions. The insights about human behaviors we may gain from research that uses this data are promising. However, members of the research community are questioning the ethics of these practices, and at the heart of the matter are some difficult questions about the boundaries between public and private information. This workshop report, the second in a series, identifies selected questions and explores issues around the meaning of “public” in the context of using data about individuals for research purposes.
Altman M, Amos B, McDonald MP, Smith D. Revealing Preferences: Why Gerrymanders are Hard to Prove, and What to Do about It. Social Science Research Network [Internet]. Working Paper.

Gerrymandering requires illicit intent. We classify six proposed methods to infer the intent of a redistricting authority using a formal framework for causal inferences that encompasses the redistricting process from the release of census data to the adoption of a final plan. We argue all proposed techniques to detect gerrymandering can be classified within this formal framework. Courts have, at one time or another, weighed evidence using one or more of these methods to assess racial or partisan gerrymandering claims. We describe the assumptions underlying each method, raising some heretofore unarticulated critiques revealed by laying bare their assumptions. We then review how these methods were employed in the 2014 Florida district court ruling that the state legislature violated a state constitutional prohibition on partisan gerrymandering, and propose standards that advocacy groups and courts can impose upon redistricting authorities to ensure they are held accountable if they adopt a partisan gerrymander.

Altman M, Magar E, McDonald MP, Trelles A. The Effects of Automated Redistricting and Partisan Strategic Interaction on Representation: The Case of Mexico. Social Science Research Network [Internet]. Working Paper.
In the U.S., redistricting is deeply politicized and often synonymous with gerrymandering -- the manipulation of boundaries to promote the goals of parties, incumbents, and racial groups. In contrast, Mexico’s federal redistricting has been implemented nationwide since 1996 through automated algorithms devised by the electoral management body (EMB) in consultation with political parties. In this setting, parties interact strategically and generate counterproposals to the algorithmically generated plans in a closed-door process that is not revealed outside the bureaucracy. Applying geospatial statistics and large-scale optimization to a novel dataset that has never been available outside of the EMB, we analyze the effects of automated redistricting and partisan strategic interaction on representation. Our dataset comprises the entire set of plans generated by the automated algorithm, as well as all the counterproposals made by each political party during the 2013 redistricting process. Additionally, we inspect the 2006 map with new data and two proposals to replace it towards 2015 in search of partisan effects and political distortions. Our analysis offers a unique insight into the internal workings of a purportedly autonomous EMB and the partisan effects of automated redistricting on representation.
Okulicz-Kozaryn A, Altman M. The Energy Paradox: Energy Use and Happiness. ARIQ. 2019.

It is widely claimed that there is a substantial tradeoff between energy preservation and human wellbeing. We are reluctant to cut energy consumption for fear of decline in our happiness. Despite technological advances, Earth’s per capita energy use continues to grow. The environmental consequences are well known: resource depletion, pollution, and global warming. Here we studied the relationship between energy consumption and happiness across four decades, and multiple levels of geography. Surprisingly, we found that received wisdom is false–for counties, states and nations, energy consumption is neither necessary for wellbeing, nor linked directly to it. The relation between energy use and happiness is very similar to the relation between economic growth and happiness, i.e., the Easterlin Paradox.

McDonald M, Altman M. The Public Mapping Project: How Public Participation Can Revolutionize Redistricting. Cornell University Press; 2018.

The Laurence and Lynne Brown Democracy Medal is an initiative of the McCourtney Institute for Democracy at Pennsylvania State University. It annually recognizes outstanding individuals, groups, and organizations that produce exceptional innovations to further democracy in the United States or around the world.

Micah Altman and Michael P. McDonald unveil the Public Mapping Project, which developed DistrictBuilder, an open-source software redistricting application designed to give the public transparent, accessible, and easy-to-use online mapping tools. As they show, the goal is for all citizens to have access to the same information that legislators use when drawing congressional maps—and use that data to create maps of their own.

Wood A, Altman M, Bembenek A, Bun M, Gaboardi M, Honaker J, O'Brien DR, Steinke T, Vadhan S. Differential Privacy: A Primer for a Non-Technical Audience. Vanderbilt Journal of Entertainment and Technology Law (JETlaw) [Internet]. 2018;21:209-276.

Differential privacy is a formal mathematical framework for quantifying and managing privacy risks. It provides provable privacy protection against a wide range of potential attacks, including those currently unforeseen. Differential privacy is primarily studied in the context of the collection, analysis, and release of aggregate statistics. These range from simple statistical estimations, such as averages, to machine learning. Tools for differentially private analysis are now in early stages of implementation and use across a variety of academic, industry, and government settings. Interest in the concept is growing among potential users of the tools, as well as within legal and policy communities, as it holds promise as a potential approach to satisfying legal requirements for privacy protection when handling personal information. In particular, differential privacy may be seen as a technical solution for analyzing and sharing data while protecting the privacy of individuals in accordance with existing legal or policy requirements for de-identification or disclosure limitation.



Altman M, Bourg C, Cohen P, Choudhury GS, Henry C, Kriegsman S, Minow M, Selematsela D, Sengupta A, Suber P, et al. A Grand Challenges-Based Research Agenda for Scholarly Communication and Information Science. 2018.
The “Grand Challenges-Based Research Agenda for Scholarly Communication and Information Science” describes a vision for a more inclusive, open, equitable, and sustainable future for scholarship; characterizes the central technical, organizational, and institutional barriers to this future; describes the areas research needs to advance this future; and identifies targeted “grand challenge” research problems for knowledge generation. These “grand challenges” are fundamental research problems with broad applications, whose solutions are potentially achievable within the next decade.
Chassanoff A, Borghi J, AlNoamany Y, Thornton K. Software Curation in Research Libraries: Practice and Promise. The Journal of Librarianship and Scholarly Communication. 2018;Forthcoming.

INTRODUCTION. Research software plays an increasingly vital role in the scholarly record. Academic research libraries are in the early stages of exploring strategies for curating and preserving research software, aiming to provide long-term access and use. DESCRIPTION OF PROGRAM. In 2016, the Council on Library and Information Resources (CLIR) began offering postdoctoral fellowships in software curation. Four institutions hosted the initial cohort of software curation fellows. This article describes the work activities and research program of the cohort, highlighting the challenges and benefits of doing this exploratory work in research libraries. NEXT STEPS. Academic research libraries are poised to play an important role in research and development around robust services for software curation. The next cohort of CLIR fellows is set to begin in fall 2018 and will likely shape and contribute substantially to an emergent research agenda.

Altman M, Wood A. How big data challenges privacy, and how science can help. The Washington DC 100 [Internet]. 2018;May 8.
The collection of personal information has become broader and more threatening than anyone could have imagined. Our research finds traditional approaches to safeguarding privacy are stretched to the limit as thousands of data points are collected about us every day and maintained indefinitely by a host of technology platforms.
Altman M, Cohen A, Fluitt A, Nissim K, Washington M, Wood A. Comments on New Techniques and Methodologies for Combining Data from Multiple Sources. Office of Management and Budget. 2018.

Comments in response to the request for information "New techniques and methodologies based on combining data from multiple sources."

Altman M, Vayena E, Wood A. A Harm-Reduction Framework for Algorithmic Fairness. IEEE Privacy and Security. 2018;Forthcoming.

In this article we recognize the profound effects that algorithmic decision-making can have on people’s lives and propose a harm-reduction framework for algorithmic fairness. We argue that any evaluation of algorithmic fairness must take into account the foreseeable effects that algorithmic design, implementation, and use have on the well-being of individuals. We further demonstrate how counterfactual frameworks for causal inference developed in statistics and computer science can be used as the basis for defining and estimating the foreseeable effects of algorithmic decisions. Finally, we argue that certain patterns of foreseeable harms are unfair. An algorithmic decision is unfair if it imposes predictable harms on sets of individuals that are unconscionably disproportionate to the benefits these same decisions produce elsewhere. Also, an algorithmic decision is unfair when it is regressive, i.e., when members of disadvantaged groups pay a higher cost for the social benefits of that decision.

Hellyar D, Walsh R, Altman M. Improving digital experience through modeling the human experience: The resurgence of ‘Virtual’- (& ‘Augmented’- & ‘Mixed’-) Reality. In: Reconceptualizing Libraries. Routledge Press; 2018.

This essay is designed generally to introduce information professionals and researchers to the topic of VR, to characterize its potential to enhance human experiences, and to identify the concepts that are critical to its application. The essay is also intended specifically for professional librarians, and applied library information science researchers, who aim to integrate new interface technologies and design concepts into library systems.

Nissim K, Steinke T, Wood A, Altman M, Bembenek A, Bun M, Gaboardi M, O'Brien DR, Vadhan S. Differential Privacy: A Primer for a Non-Technical Audience. Vanderbilt Journal of Entertainment and Technology Law. 2018;Forthcoming.

Differential privacy is a formal mathematical framework for guaranteeing privacy protection when analyzing or releasing statistical data. Recently emerging from the theoretical computer science literature, differential privacy is now in initial stages of implementation and use in various academic, industry, and government settings.

This document is a primer on differential privacy. Using intuitive illustrations and limited mathematical formalism, this primer provides an introduction to differential privacy for non-technical practitioners, who are increasingly tasked with making decisions with respect to differential privacy as it grows more widespread in use. In particular, the examples in this document illustrate ways in which social science and legal audiences can conceptualize the guarantees provided by differential privacy with respect to the decisions they make when managing personal data about research subjects and informing them about the privacy protection they will be afforded.

Altman M, Wood A, O'Brien D, Gasser U. Practical Approaches to Big Data Privacy Over Time. International Journal of Data Privacy Law [Internet]. 2018.

Increasingly, governments and businesses are collecting, analyzing, and sharing detailed information about individuals over long periods of time. Vast quantities of data from new sources and novel methods for large-scale data analysis promise to yield deeper understanding of human characteristics, behavior, and relationships and advance the state of science, public policy, and innovation. At the same time, the collection and use of fine-grained personal data over time is associated with significant risks to individuals, groups, and society at large. In this article, we examine a range of long-term data collections, conducted by researchers in social science, in order to identify the characteristics of these programs that drive their unique sets of risks and benefits. We also examine the practices that have been established by social scientists to protect the privacy of data subjects in light of the challenges presented in long-term studies. We argue that many uses of big data, across academic, government, and industry settings, have characteristics similar to those of traditional long-term research studies. In this article, we discuss the lessons that can be learned from longstanding data management practices in research and potentially applied in the context of newly emerging data sources and uses.

Altman M, McDonald M. Why redistricting should not be left to a mathematical formula alone. LSE US Centre [Internet]. 2017.
In new research, Micah Altman and Michael P. McDonald find that there are limitations to such a formula-based approach, especially given that there is no consensus on which one is a good measure of representation. Instead, they propose that formulas be used alongside open and transparent systems that support public participation in the redistricting process.
Gallinger M, Bailey J, Cariani K, Owens T, Altman M. Trends in Digital Preservation Capacity and Practice: Results from the 2nd Bi-annual National Digital Stewardship Alliance Storage Survey. D-Lib [Internet]. 2017;23(7/8).

Research and practice in digital preservation require a solid foundation of evidence of what is being protected and what practices are being used. The National Digital Stewardship Alliance (NDSA) storage survey provides a rare opportunity to examine the practices of most major US memory institutions. The repeated, longitudinal design of the NDSA storage surveys offers a rare opportunity to more reliably detect trends within and among preservation institutions rather than the typical surveys of digital preservation, which are based on one-time measures and convenience (Internet-based) samples. The survey was conducted in 2011 and in 2013. The results from these surveys have revealed notable trends, including continuity of practice within organizations over time, growth rates of content exceeding predictions, shifts in content availability requirements, and limited adoption of best practices for interval fixity checking and the Trusted Digital Repositories (TDR) checklist. Responses from new memory organizations increased the variety of preservation practice reflected in the survey responses.


Castro E, Crosas M, Garnett A, Sheridan K, Altman M. Evaluating and Promoting Open Data Practices in Open Access Journals. Journal of Scholarly Publishing. 2017;Forthcoming.

In the last decade there has been a dramatic increase in attention from the scholarly communications and research community to open access (OA) and open data practices. These are potentially related, because journal publication policies and practices both signal disciplinary norms and provide direct incentives for data sharing and citation. However, there is little research evaluating the data policies of OA journals. In this study, we analyze the state of data policies in open access journals by employing random sampling of the Directory of Open Access Journals (DOAJ) and Open Journal Systems (OJS) journal directories, and applying a coding framework that integrates both previous studies and emerging taxonomies of data sharing and citation. This study reveals, for the first time, the low prevalence of data sharing policies and practices in OA journals, which differs from the findings of previous studies of commercial journals in specific disciplines.