CarPricePred

CarPricePred is my first official project, developed together with Davide Picello, a close friend of mine. Work started in December 2023 and ended in the first months of 2024, after collecting ideas and putting in effort day by day alongside our studies.

Preface

This project was born outside of any academic or school context, stemming from a real need of mine. At the time I was getting into the used car market, and for each listing I didn't know whether the price was good or bad. The only way to figure it out was to search for similar listings and check whether the price was in line. One day, however, I had an idea that I proposed to my friend Davide: what if, instead of manually checking the price every time, we created software that could calculate it autonomously? The idea appealed to both of us, and so we began to develop it.

Development

The Idea

I (Anthony) study Law and Technology and Davide studies Computer Science, both at the University of Padua. Above all, we have been great friends since we were kids. We started exchanging ideas and viewpoints to make this idea real, and our different backgrounds further stimulated the exchange. In the end, we decided to create software capable of estimating the fair price of a used car from a handful of parameters.

Many more parameters could have been used, such as the type of transmission, fuel type, engine power, number of doors, emission class, interiors... But even with just these few parameters, the level of accuracy achieved was such that we decided to stick with them.

Implementation

To make all this possible, we decided to create a neural network that, once trained on a dataset of listings, would calculate a price.

First of all, we needed a good dataset. To get one, we wrote a web scraper, a program that retrieves information from web pages. Ours was designed to work on the well-known used car marketplace “AutoScout24,” which, with about 2 million cars for sale across Europe (400,000 of them in Italy), was the richest source of relevant data we could find on the web. After hours and hours of our computers continuously extracting data, we ended up with a dataset of 160,000 cars with their respective features and prices. A great starting point for training our model.
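To give an idea of what the extraction step of a scraper looks like, here is a minimal sketch using only Python's standard library. It is not our actual scraper: the HTML snippet, the `data-*` attribute names, the field names and the `ListingParser` and `parse_listings` helpers are all invented for illustration and do not reflect the real structure of AutoScout24 pages. Downloading pages, pagination and request throttling are left out.

```python
from html.parser import HTMLParser

# Invented markup for illustration only; it does NOT mirror AutoScout24's pages.
SAMPLE_PAGE = """
<ul>
  <li class="listing" data-model="Fiat Panda" data-year="2018" data-km="62000" data-price="8900"></li>
  <li class="listing" data-model="Audi A3" data-year="2020" data-km="41000" data-price="21500"></li>
  <li class="ad">Sponsored content</li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collects one dict per <li class="listing"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Skip everything that is not a car listing (ads, layout elements, ...).
        if tag != "li" or attributes.get("class") != "listing":
            return
        self.rows.append({
            "model": attributes["data-model"],
            "year": int(attributes["data-year"]),
            "km": int(attributes["data-km"]),
            "price_eur": int(attributes["data-price"]),
        })


def parse_listings(html):
    """Return the listings found in one page of HTML as a list of dicts."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.rows


rows = parse_listings(SAMPLE_PAGE)
for row in rows:
    print(row)
```

In a real run, rows like these would be appended to a file page after page until the dataset is large enough for training.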

Model

The backend of the entire project was written in Python 3.11, an excellent language for this kind of work given the vast array of libraries dedicated to data science and data management. The model itself is a neural network trained with backpropagation. The number of layers and neurons changed significantly as the testing phases progressed, but regularization, dropout, and ReLU-style activation functions were used consistently.

The most demanding part for the team was data processing and the selection of variables and targets. The target is the car's price, from which the model was trained to recognize pricing patterns: from luxury car prices to the depreciation of economy cars to the appreciation of rarer vehicles. In the final stage, the model recognizes about 2,000 car models with an accuracy close to 95%.

Release

The entire project was then given an interface through a web page built with Flask, available at the link below. Setting up a server was a very educational activity, and not one that comes up every day. Currently, the web page has only the essentials to function; no aesthetic improvements have been implemented.
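As a rough illustration of the ingredients named in the Model section (backpropagation, ReLU, dropout, regularization), here is a tiny network in plain Python. It is a sketch under assumptions, not our real model: the layer size, learning rate, the two input features (`age` and `mileage`, both scaled to 0..1), the synthetic price formula and the names `TinyMLP`, `mse` and `make_toy_data` are all made up for the example. Regularization is assumed here to be an L2 penalty, and a real project would normally rely on a deep learning library instead of hand-written loops.

```python
import random


class TinyMLP:
    """One hidden ReLU layer, inverted dropout, L2 penalty, per-sample SGD."""

    def __init__(self, n_in, n_hidden, dropout=0.2, l2=1e-4, lr=0.02, seed=0):
        self.rng = random.Random(seed)
        self.w1 = [[self.rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [self.rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0
        self.dropout, self.l2, self.lr = dropout, l2, lr

    def _forward(self, x, train):
        pre = [sum(w * xi for w, xi in zip(row, x)) + b
               for row, b in zip(self.w1, self.b1)]
        if train:
            # Inverted dropout: drop units at random, rescale the survivors.
            keep = 1.0 - self.dropout
            mask = [1.0 / keep if self.rng.random() < keep else 0.0 for _ in pre]
        else:
            mask = [1.0] * len(pre)
        hidden = [max(0.0, z) * m for z, m in zip(pre, mask)]  # ReLU, then mask
        y_hat = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return pre, mask, hidden, y_hat

    def predict(self, x):
        return self._forward(x, train=False)[3]

    def train_step(self, x, y):
        """One backpropagation step on the loss 0.5 * (y_hat - y) ** 2."""
        pre, mask, hidden, y_hat = self._forward(x, train=True)
        err = y_hat - y
        for j, row in enumerate(self.w1):
            # Gradient reaching hidden unit j, computed before w2[j] is updated.
            grad_pre = err * self.w2[j] * mask[j] if pre[j] > 0.0 else 0.0
            self.w2[j] -= self.lr * (err * hidden[j] + self.l2 * self.w2[j])
            for i, xi in enumerate(x):
                row[i] -= self.lr * (grad_pre * xi + self.l2 * row[i])
            self.b1[j] -= self.lr * grad_pre
        self.b2 -= self.lr * err


def mse(model, data):
    return sum((model.predict(x) - y) ** 2 for x, y in data) / len(data)


def make_toy_data(n, seed=1):
    """Synthetic listings: price falls with age and mileage (invented formula)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        age, mileage = rng.random(), rng.random()
        data.append(([age, mileage], 1.0 - 0.5 * age - 0.3 * mileage))
    return data


data = make_toy_data(60)
model = TinyMLP(n_in=2, n_hidden=8)
loss_before = mse(model, data)
for _ in range(200):  # epochs
    for x, y in data:
        model.train_step(x, y)
loss_after = mse(model, data)
print(f"MSE before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Dropout is only active during training; `predict` uses every hidden unit, which is why the same input always gives the same estimate once the model is trained.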

Conclusions

This project has been extremely educational and fun; using new technologies and facing new challenges every day has made us realize how stimulating, yet demanding, it is to take a product from conception to distribution. Doing it with a friend made it even more enjoyable and meaningful.

Try it yourself

Repo:

https://github.com/DavidePicc/CarPricePred