Algorithmic Contract Design for Crowdsourced Ranking: Conclusions, Future Directions, and References

3 May 2024

This paper is available on arxiv under CC 4.0 license.


(1) Kiriaki Frangias;

(2) Andrew Lin;

(3) Ellen Vitercik;

(4) Manolis Zampetakis.

Abstract and Introduction

Warm-up: Agents with known equal disutility

Agents with unknown disutilites


Conclusions and future directions, References

A Summary of notation

B Omitted proofs from Section 2

C Omitted proofs from Section 3

D Additional Information about Experiments

5 Conclusions and future directions

We studied the multifaceted problem of designing a crowdsourcing mechanism that efficiently and accurately ranks a set of items using pairwise comparisons from rational agents. We based our approach on the classic principal-agent model from contract theory. To distribute comparisons among agents, we used an unexpected connection to the social golfer problem which allowed us to simultaneously minimize the agents’ workload and ensure enough agents evaluate each comparison.

We showed that by optimizing the payment mechanisms, the principal can incentivize enough agents to exert effort, ensuring our algorithm recovers the ground-truth ranking. Our experiments showed that even when noise is added to our original model, our method consistently aids the principal in retrieving the accurate ground-truth ordering.

llude, which allowed us to assign the same set of verified comparisons to all agents. How should our mechanism change if agents can collude? Moreover, we assumed that agents who do not exert effort will return comparisons consistent with a uniformly-random permutation of the items. What other models of agent behavior can we study under this framework? Finally, as is standard in contract theory, we assumed that the principal knows the probability π that an effortful agent is good. How should we handle the case where π is unknown?


[1] Abolfazl Asudeh, Gensheng Zhang, Naeemul Hassan, Chengkai Li, and Gergely V. Zaruba. Crowdsourcing pareto-optimal object finding by pairwise comparisons. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM), 2015.

[2] Ralph Allan Bradley and Milton E Terry. Rank analysis of incomplete block designs: I. the method of paired comparisons. Biometrika, 39(3/4):324–345, 1952.

[3] Yang Cai, Constantinos Daskalakis, and Christos Papadimitriou. Optimum statistical estimation with strategic data sources. In Conference on Learning Theory (COLT), 2015.

[4] Xi Chen, Paul N Bennett, Kevyn Collins-Thompson, and Eric Horvitz. Pairwise ranking aggregation in a crowdsourced setting. In Proceedings of the International Conference on Web Search and Data Mining, pages 193–202, 2013.

[5] Yuxin Chen and Changho Suh. Spectral MLE: Top-k rank aggregation from pairwise comparisons. In International Conference on Machine Learning (ICML), 2015.

[6] Paul Duetting, Tomer Ezra, Michal Feldman, and Thomas Kesselheim. Combinatorial contracts. In Symposium on Foundations of Computer Science (FOCS), 2022.

[7] Anindya Ghose, Panagiotis G Ipeirotis, and Beibei Li. Designing ranking systems for hotels on travel search engines by mining user-generated and crowdsourced content. Marketing Science, 31(3):493–520, 2012.

[8] John Guiver and Edward Snelson. Bayesian inference for plackett-luce ranking models. In International Conference on Machine Learning (ICML), 2009.

[9] Stephen Guo, Aditya Parameswaran, and Hector Garcia-Molina. So who won? dynamic max discovery with the crowd. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, SIGMOD ’12, page 385–396, New York, NY, USA, 2012. Association for Computing Machinery. ISBN 9781450312479. doi: 10.1145/2213836.2213880. URL

[10] Guru Guruganesh, Jon Schneider, and Joshua Wang. Contracts under moral hazard and

adverse selection. In ACM Conference on Economics and Computation (EC), 2021.

[11] R. Jurca and B. Faltings. Mechanisms for making crowds truthful. Journal of Artificial Intelligence Research, 34:209–253, mar 2009. doi: 10.1613/jair.2621. URL

[12] Radu Jurca and Boi Faltings. Robust incentive-compatible feedback payments. In Workshop on Trading Agent Design and Analysis, 2006.

[13] Radu Jurca and Boi Faltings. Minimum payments that reward honest reputation feedback. In ACM Conference on Economics and Computation (EC), 2006.

[14] Jiawen Kang, Zehui Xiong, Dusit Niyato, Dongdong Ye, Dong In Kim, and Jun Zhao. Toward secure blockchain-enabled internet of vehicles: Optimizing consensus management using reputation and contract theory. IEEE Transactions on Vehicular Technology, 68(3):2906–2920, 2019.

[15] Ngai Meng Kou, Yan Li, Hao Wang, Leong Hou U, and Zhiguo Gong. Crowdsourced top-k queries by confidence-aware pairwise judgments. Proceedings of the 2017 ACM International Conference on Management of Data, 2017. URL

[16] Jean-Jacques Laffont and David Martimort. Moral Hazard: The Basic Trade-Offs, pages 145–186. Princeton University Press, 2002. URL

[17] Shili Lin. Rank aggregation methods. Wiley Interdisciplinary Reviews: Computational Statistics, 2(5):555–570, 2010.

[18] Tyler Lu and Craig Boutilier. Learning mallows models with pairwise preferences. In International Conference on Machine Learning (ICML), 2011.

[19] C. L. Mallows. Non-null ranking models. i. Biometrika, 44:114–130, 1957. URL

[20] Fei Mi and Dit-Yan Yeung. Probabilistic graphical models for boosting cardinal and ordinal peer grading in moocs. In AAAI Conference on Artificial Intelligence, 2015.

[21] Chris Piech, Jonathan Huang, Zhenghao Chen, Chuong Do, Andrew Ng, and Daphne Koller. Tuned models of peer assessment in moocs. In Proceedings of The 6th International Conference on Educational Data Mining (EDM), 2013.

[22] R. L. Plackett. The Analysis of Permutations. Journal of the Royal Statistical Society Series C, 24(2):193–202, June 1975. doi: 10.2307/2346567. URL

[23] Karthik Raman and Thorsten Joachims. Methods for ordinal peer grading. In International Conference on Knowledge Discovery and Data Mining (KDD), 2014.

[24] Karthik Raman and Thorsten Joachims. Bayesian ordinal peer grading. In Proceedings of the ACM Conference on Learning at Scale, 2015.

[25] Vikas C Raykar, Shipeng Yu, Linda H Zhao, Gerardo Hermosillo Valadez, Charles Florin, Luca Bogoni, and Linda Moy. Learning from crowds. Journal of machine learning research, 11(4), 2010.

[26] Nihar Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, and Martin Wainwright. Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2015.

[27] Nihar B Shah, Joseph K Bradley, Abhay Parekh, Martin Wainwright, and Kannan

Ramchandran. A case for ordinal peer-evaluation in moocs. In NIPS Workshop on

Data-Driven Education, 2013.

[28] Louis Leon Thurstone. The method of paired comparisons for social values. The Journal of Abnormal and Social Psychology, 21:384–400, 1927. URL

[29] Markus Triska. Solution Methods for the Social Golfer Problem. PhD thesis, Vienna University of Technology, March 2008. Available from

[30] Kristi Tsukida, Maya R Gupta, et al. How to analyze paired comparison data. Department of Electrical Engineering University of Washington, Tech. Rep. UWEETR-2011-0004, 1, 2011.

[31] Bo Waggoner and Yiling Chen. Output agreement mechanisms and common knowledge. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 2(1): 220–226, Sep. 2014. doi: 10.1609/hcomp.v2i1.13151. URL

[32] Banghua Zhu, Stephen Bates, Zhuoran Yang, Yixin Wang, Jiantao Jiao, and Michael I. Jordan. The sample complexity of online contract design. In ACM Conference on Economics and Computation (EC), 2023.

This paper is available on Arxiv under CC 4.0 license.