Phase 1 Project Blog

Andrew A Echeverria
3 min read · Dec 14, 2020

This project was definitely a different challenge from what we were doing in the previous lessons of Phase 1. Going from lessons where I only needed to pass tests to creating a gem from scratch, with seemingly unlimited room for different code structures and creativity, was very interesting. It took me by surprise at first, and I wasn't really sure how to start, so it took a couple of days to pick up steam with my coding.

One of the biggest obstacles that I had to overcome throughout this project was scraping. We had not practiced it extensively in previous lessons beyond one or two code-alongs, so I was not aware of how messy things can get depending on how a specific website's content is built. I was attempting to scrape ESPN's website for scores of soccer matches, and it was becoming frustrating because no matter what I targeted (whether it was a div or another element), when I tried to inspect it in my terminal using doc.css(""), all I would get was an empty array. This was definitely worrying, so I went to office hours with questions on how to scrape this content more effectively. It turned out that the content could not be scraped with Nokogiri because it was rendered dynamically with JavaScript, so it never appears in the static HTML that Nokogiri parses. Ultimately, because of the time constraint, I decided not to include this idea in my project.
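Roughly, the dead end looked like this. This is only an illustrative sketch; the URL and the .ScoreCell selector below are placeholders rather than the exact ones I was using:

require 'nokogiri'
require 'open-uri'

# Placeholder URL and selector, for illustration only
doc = Nokogiri::HTML(URI.open("https://www.espn.com/soccer/scoreboard"))

# Nokogiri only sees the static HTML the server returns, so selectors that
# target content injected later by JavaScript come back empty:
scores = doc.css(".ScoreCell")
puts scores.empty? # => true, because the scores are rendered client-side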

Another challenge for me was figuring out how to scrape only the leagues that I wanted. ESPN had a list of many leagues, but some were not "leagues" per se, just tournaments. Most soccer tournaments do not have standings (or rankings), which posed a problem: I could not simply scrape everything on the page, I had to be more precise about what I wanted. Originally I had scraped three tournaments, which was something I wanted to change, so I decided to make the project about only the top five leagues in Europe (Italy, Spain, France, England, Germany). To do this I created an array of only the leagues I wanted to include:

leagues_included = ["English Premier League", "German Bundesliga", "French Ligue 1", "Spanish Primera División", "Italian Serie A"]

in order to make sure I didn't pick up any leagues I didn't want. Then I iterated through:

first_leagues_container = doc.css("#fittPageContainer > div.page-container.cf > div > div > div > div:nth-child(3) > div:nth-child(1) > div > div > div > section > a")

and

second_leagues_container = doc.css("#fittPageContainer > div.page-container.cf > div > div > div > div:nth-child(3) > div:nth-child(2) > div > div > div > section > a")

to get the leagues I wanted. This took some time to figure out, as I had originally scraped only part of the leagues on the page and didn't want to backtrack too much. So I decided to create the second container, second_leagues_container, in order to scrape effectively without wasting too much time.
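The filtering itself was just a matter of checking each scraped link's text against the leagues_included array. Here is a rough sketch of the idea (the .text.strip call is an assumption about how the league name sits inside each link):

# Combine both containers and keep only the leagues in my list
wanted_leagues = (first_leagues_container + second_leagues_container).select do |league_link|
  leagues_included.include?(league_link.text.strip)
end

wanted_leagues.each do |league_link|
  puts league_link.text.strip
  # league_link["href"] can then be followed to scrape that league's standings page
end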

The coolest part of the project was working with the structure of my gem's output. I found it fascinating that I could make it look however I wanted, which is something I had never done before because I'm so used to just seeing content on a website.

standings_container.each_with_index do |team, index|
  # each row is a table row; the team's stats live in its <td> cells, indexed by column
  name = team.css("td")[0].text
  points = team.css("td")[6].text
  wins = team.css("td")[2].text
  losses = team.css("td")[4].text
  draws = team.css("td")[3].text
  puts "#{index + 1}. #{name}, Wins: #{wins}, Losses: #{losses}, Draws: #{draws}, Points: #{points}"
end

For me, deciding which data to pull from ESPN.com's standings was the most exciting part of the project. As one can tell, I decided to use the team's name and its record of wins, losses, draws, and points, and then put it into a format that I felt most clearly conveyed this to the user. Finding the correct attributes was very simple; I just needed to be aware of the index of each item on the page. When the items are laid out in a row, as they were on this particular website, it makes things very easy.
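One quick (hypothetical) way to confirm which <td> index holds which stat is to print every cell of a single standings row along with its index:

# Hypothetical helper: inspect the first row's cells to map indices to columns
first_row = standings_container.first
first_row.css("td").each_with_index do |cell, i|
  puts "#{i}: #{cell.text}"
end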
