Research project puts big data technologies into Millikin curriculum
"Big data" has emerged as an important tool for businesses when it comes to analytics and measuring success. It is changing the way businesses market their brands, helping them raise awareness among customers and understand what those customers want.
Some background on big data: it is a field concerned with ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large to be handled by traditional data-processing application software. The amount of data circulating today is growing at an exponential rate, bringing change to areas like marketing, personalization and business intelligence.
With the emergence of big data, it is becoming more important than ever for information systems (IS) students to understand big data technologies as they enter the workforce. Millikin University graduate Justin DeBo '18 recognized this and, while still a student, used his James Millikin Scholar (JMS) project to extend Millikin's Management Information Systems curriculum to include more hands-on education related to big data. The idea was formed during DeBo's internship at State Farm Corporate Headquarters in Bloomington, Ill.
"During my internship at State Farm, I worked on the Big Data Platforms support team," said DeBo. "While there, I observed that even experienced IT professionals were inexperienced with big data technologies. I knew that if we could get these technologies integrated into our curriculum, it could help position our IS majors to be able to step into those kinds of jobs."
DeBo incorporated his research into Millikin's Business Intelligence and Big Data course, led by RJ Podeschi, associate professor of information systems and chair of the Tabor School of Business Undergraduate Programs. The course provides students with hands-on experience in data warehousing, data analytics and executive dashboards through real-world data sets and applications.
"We got to a point in the class where we could discuss how things were structured, but we were in a situation where students didn't have experience dealing with an open-source framework called Hadoop," said Podeschi. "Justin felt like we could do more with this class in which we could introduce some hands-on labs."
For his research, DeBo wanted to evaluate two methods of giving students exposure to Hadoop: an on-premise cluster and virtual machines. Hadoop is a framework that uses a network of many computers to solve problems involving massive amounts of data.
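For readers curious how work gets divided across many computers, the programming model most closely associated with Hadoop is MapReduce: each machine processes its own piece of the data, the framework groups the intermediate results, and a final step combines them. The short Python sketch below only imitates that idea on a single machine by counting words. It is not code from DeBo's labs, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Count words across chunks; this sketch processes them one after another."""
    pairs = []
    for chunk in chunks:
        # Assumption for illustration: on a real cluster, each chunk could be
        # handled by a different machine instead of in a single loop.
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["big data is big", "data about data"]))
```

Because each chunk is mapped independently of the others, the same steps can in principle be spread over as many machines as there are chunks, which is the property that makes the approach suitable for very large data sets.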
Through corporate donations and departmental funds from State Farm, the class acquired two used HP servers and sufficient hard disk storage and memory for the on-premise Hadoop cluster. For the virtual machine, the class used Cloudera QuickStart, a free virtual software platform for data engineering and analytics.
Both environments allowed students to see how the Hadoop ecosystem functions.
"During these labs, students got the chance to experiment with data management and analysis tools to get an end-to-end view of how a big data application would be implemented," said DeBo.
Podeschi implemented the hands-on activities during the fall 2018 semester. "There are some pieces that went well and some pieces that need to be adjusted, but it's something that I continue to plan on incorporating into the course," said Podeschi.
After weighing the pros and cons, the research determined that the on-premise Hadoop cluster was too administratively taxing to manage, making Cloudera QuickStart better suited for giving students initial exposure to Hadoop as well as other tools such as Hive and Apache Spark.
"What the students found was that it wasn't as foreign to them as they thought," said Podeschi. "A lot of the things that happened in this big data environment had language that was similar in the database class."
DeBo and Podeschi co-authored and submitted their research to the EDSIG Conference on Information Systems and Computing Education in May 2018. They presented their research at the conference in November 2018 in Norfolk, Va. Based on the results, their research was accepted for publication in the August 2019 edition of the Information Systems Education Journal, a peer-reviewed academic journal published six times per year by information systems and computing academic professionals.
To be published, every submission must go through a rigorous evaluation process involving at least three blind reviews by qualified academic, industrial or governmental computing professionals.
"I was incredibly proud, not only because it was recognition of the work RJ and I did putting it together, but because it was validation that other schools saw the value in the skills we are trying to prepare our students with at Millikin," said DeBo.
When asked about the importance of understanding big data as an undergraduate, DeBo said, "I think in general our IS coursework is centered mostly on traditional IT skills (software development, database management, infrastructure, etc.). I think by incorporating new technology like Spark and Hadoop into courses, it makes the curriculum more agile and sets it up to better adapt to the ever-changing technology landscape."
DeBo added, "I'm thrilled about how successful the project was, but it wouldn't have been possible without the experience I gained during my internship, and the strong and supportive partnership between Millikin and State Farm."
DeBo currently works as a software developer for State Farm, where he helps support the company's test data management applications.