COMP 5704: Parallel Algorithms and Applications in Data Science
School of Computer Science
Carleton University, Ottawa, Canada
Parallel Genetic Algorithm for GitHub Projects Recommendation
Name: Lance Wang (#101144671)
In the era of big data, parallel computing has been widely used to overcome computational barriers. Multicore processors, cloud computing, and peer-to-peer (P2P) techniques are maturing, giving many researchers and developers access to vast computational resources. This has made evolutionary computing (EC; Fogel et al., 1966) a popular approach for finding and optimising solutions to a wide range of problems. The genetic algorithm (GA) has long been a popular solution-finding and optimisation approach in computer science. The parallel genetic algorithm (PGA) is also a trending research area, since a GA can be parallelized and distributed straightforwardly.
Since GitHub was launched in 2008, a tremendous number of developers have devoted themselves to this open-source software (OSS) platform, creating countless state-of-the-art software projects and techniques. Over time, developers build not only professional skills (i.e., expertise) but also communities (i.e., social networks) within GitHub. However, as the numbers of projects (also called repositories) and users increase dramatically, finding new projects to contribute to is no longer as easy as it once was. Therefore, in this research, we aim to build a GitHub repository recommender system that uses a PGA to find the next appropriate projects for developers based on their expertise and social networks.
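To illustrate why a GA parallelizes straightforwardly, the following is a minimal sketch of a master-worker scheme, one common way to parallelize a GA: the fitness of each individual is independent of the others, so the evaluations can be handed to a pool of workers. It is not the recommender proposed here. The repository scores, the equal weighting of expertise and social proximity, the chromosome encoding, and all function names are assumptions made purely for illustration.

```python
import random
from concurrent.futures import Executor, ThreadPoolExecutor

# Hypothetical per-repository scores for one developer:
# (expertise match, social proximity). Values are made up for illustration.
REPO_SCORES = [(0.9, 0.1), (0.2, 0.8), (0.5, 0.5), (0.7, 0.9),
               (0.1, 0.2), (0.6, 0.3), (0.4, 0.7), (0.8, 0.6)]
K = 3  # number of repositories to recommend (assumed)


def fitness(chromosome):
    """Score a candidate recommendation list (a tuple of K distinct repo indices)."""
    return sum(0.5 * REPO_SCORES[i][0] + 0.5 * REPO_SCORES[i][1] for i in chromosome)


def random_chromosome(rng):
    return tuple(rng.sample(range(len(REPO_SCORES)), K))


def crossover(a, b, rng):
    """Child draws K distinct repositories from the union of both parents."""
    pool = sorted(set(a) | set(b))
    return tuple(rng.sample(pool, K))


def mutate(chromosome, rng, rate=0.2):
    """With probability `rate`, swap one gene for a repository not yet in the list."""
    genes = list(chromosome)
    if rng.random() < rate:
        unused = [i for i in range(len(REPO_SCORES)) if i not in genes]
        genes[rng.randrange(K)] = rng.choice(unused)
    return tuple(genes)


def evolve(executor: Executor, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Parallel step: each individual's fitness is evaluated independently.
        scores = list(executor.map(fitness, population))
        ranked = [c for _, c in sorted(zip(scores, population), reverse=True)]
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children  # elitism: parents survive unchanged
    scores = list(executor.map(fitness, population))
    best_score, best = max(zip(scores, population))
    return best, best_score


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        best, score = evolve(pool)
    print(sorted(best), round(score, 3))
```

The sketch uses a thread pool only to stay portable. `evolve` accepts any `concurrent.futures.Executor`, so a `ProcessPoolExecutor` could be passed instead when the fitness function is CPU-bound. Selection, crossover, and mutation stay on the master, which keeps the random number stream, and therefore the result, independent of the number of workers.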
- E. Alba. Parallel metaheuristics: a new class of algorithms, volume 47. John Wiley & Sons, 2005.
- E. Alba, C. Blum, P. Asasi, C. Leon, and J. A. Gomez. Optimization techniques for solving complex problems, volume 76. John Wiley & Sons, 2009.
- L. J. Fogel, A. J. Owens, and M. J. Walsh. Artificial intelligence through simulated evolution. Wiley, New York, NY, 1966. URL https://cds.cern.ch/record/107769.
- G. Gousios. The GHTorrent dataset and tool suite. In Proceedings of the 10th Working Conference on Mining Software Repositories, MSR '13, pages 233–236, Piscataway, NJ, USA, 2013. IEEE Press. ISBN 978-1-4673-2936-1. URL http://dl.acm.org/citation.cfm?id=2487085.2487132.
- T. Harada and E. Alba. Parallel genetic algorithms: A useful survey. ACM Comput. Surv., 53(4), Aug. 2020. ISSN 0360-0300. doi: 10.1145/3400031. URL https://doi.org/10.1145/3400031.
- H. M. Harmanani, S. Bou Ghosn, and F. Drouby. A parallel genetic algorithm for the open-shop scheduling problem using deterministic and random moves. 2016.
- E. L. Lawler, J. K. Lenstra, A. H. R. Kan, and D. B. Shmoys. Sequencing and scheduling: Algorithms and complexity. Handbooks in operations research and management science, 4:445–522, 1993.
- R. Ohira and M. S. Islam. GPU accelerated genetic algorithm with sequence-based clustering for ordered problems. In 2020 IEEE Congress on Evolutionary Computation (CEC), pages 1–8. IEEE, 2020.
- J. Porta, J. Parapar, R. Doallo, F. F. Rivera, I. Santé, and R. Crecente. High performance genetic algorithm for land use planning. Computers, environment and urban systems, 37:45–58, 2013.
- E.-G. Talbi and G. Hasle. Metaheuristics on GPUs. J. Parallel Distributed Comput., 73(1):1–3, 2013.