This discipline belongs to the field of Computers and Information Technology / Software Systems Engineering. It aims to familiarize students with the main approaches, models, and explanatory theories concerning the acquisition, storage, and processing of large volumes of data, as applied to practical problems and applications, with relevance for stimulating the students' learning process.
The discipline covers basic notions such as the SQL standard for communicating with databases and programming in Java, Python, and Scala, as well as advanced notions such as Apache Hadoop (the HDFS distributed file system and the MapReduce processing framework, together with the main elements of its ecosystem: Sqoop, Flume, NoSQL, Pig, Hive, YARN, Impala, ZooKeeper) and the concepts and principles of Big Data processing on platforms such as Spark and Kafka. Together, these topics prepare students to deal with the challenges of the 5 Vs and with the methodological and procedural milestones of the Big Data field.
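To give a concrete flavor of the MapReduce model named above, the sketch below is a minimal, single-machine illustration in pure Python of the three phases (map, shuffle, reduce) applied to word counting; it is an assumption-free teaching toy, not Hadoop code, and all function names are illustrative only.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1

def shuffle_phase(pairs):
    # Shuffle: group all emitted values by their key (the word).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the grouped values, here by summing the counts.
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data big ideas", "data processing"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts)  # → {'big': 2, 'data': 2, 'ideas': 1, 'processing': 1}
```

In Hadoop the same three phases run distributed across a cluster, with HDFS holding the input splits and YARN scheduling the map and reduce tasks.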