🧠 Introduction to AI
Neu-Ulm University of Applied Sciences
April 13, 2024
In preparation for the lecture, you need to read Stephen Wolfram’s article on what ChatGPT is doing and why it works
If you have only a limited understanding of what GenAI is, please go through the GeeksforGeeks article on the basics of generative AI
If you want to do a deep dive, please consider working through Microsoft’s Artificial Intelligence for Beginners - A Curriculum
What are two new things you have learned about deep learning?
Identify, list and explain the key concepts discussed in the article in your own words (e.g., temperature).
Reflect on how these concepts contribute to ChatGPT’s functionality.
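As a starting point for the "temperature" example mentioned above, the following is a minimal sketch of how a temperature value can reshape next-token probabilities before sampling. It is an illustration under stated assumptions, not ChatGPT's actual implementation: the token list, the scores, and the function names `softmax_with_temperature` and `sample_token` are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature=1.0, rng=random):
    """Draw one token at random according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical candidate tokens and scores, made up for illustration only.
tokens = ["cat", "dog", "fish"]
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, temperature=0.2)   # sharper distribution
high = softmax_with_temperature(logits, temperature=2.0)  # flatter distribution
print("T=0.2:", [round(p, 3) for p in low])
print("T=2.0:", [round(p, 3) for p in high])
print("sampled:", sample_token(tokens, logits, temperature=1.0))
```

In this toy setup, a low temperature concentrates almost all probability on the highest-scoring token, while a high temperature spreads it more evenly across the candidates.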
Explain in your own words what a “loss function”, sometimes also called “cost function”, is.
How does the loss function change over the course of training a neural network?
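To make the two questions above more concrete, here is a small sketch that records a loss value during training. It assumes a deliberately tiny setting that is not taken from the article: a one-parameter model `y = w * x`, mean squared error as the loss, and plain gradient descent. The data and the names `mse` and `train` are hypothetical.

```python
def mse(w, xs, ys):
    """Mean squared error between the predictions w * x and the targets y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, lr=0.05, epochs=10):
    """Fit w by gradient descent and record the loss before every update."""
    w = 0.0
    history = []
    for _ in range(epochs):
        history.append(mse(w, xs, ys))
        # Derivative of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w, history


# Toy data generated by y = 2 * x, so the best possible w is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w, history = train(xs, ys)
for epoch, loss in enumerate(history):
    print(f"epoch {epoch}: loss = {loss:.6f}")
print(f"learned w = {w:.4f}")
```

In this toy example the loss shrinks at every step. Real neural network training is usually noisier, so it is worth comparing this idealized curve with the behavior described in the article.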
Analyze the following statements and determine whether they are true or false. Justify your answers.
Form groups of two and do additional research on the architecture and building blocks of the most notable feature of technologies like GPT, so-called transformers. Prepare to explain the concept to the group.
Good read: Medium — Transformer Architecture Simplified
Consider what limitations you have perceived and/or heard about when using Large Language Models (LLMs). Relate the limitations to the things you have learnt about how LLMs work and find explanations for these limitations. Prepare a short presentation about the most interesting limitation and the explanation you found.
Research “mega prompts” and create a mega prompt that turns ChatGPT into a coach for creating research questions, one that guides you through multiple steps in finding a good research question on a topic that interests you.
Create a research question using the coach and reflect on whether using GPT as a guide is a meaningful strategy.
Identify ethical concerns related to AI and language models. Choose one ethical concern and discuss how it applies to ChatGPT.
Transfer learning involves incorporating the knowledge of a pre-trained network into a new model, allowing the new model to learn faster and achieve better performance.
Data augmentation refers to the use of pre-trained networks to generate new training examples, expanding the available data and potentially improving the performance of the new model.
The output of the transformer encoder is a higher-dimensional representation of the entire input sequence. It captures not only the meaning of individual words but also the relationships and context between them. This representation is often much richer and more complex than a single embedding vector.