Which neural network architecture is commonly used in Large Language Models (LLMs)?
o Transformer
o Recurrent Neural Networks (RNNs)
o Long Short-Term Memory (LSTM)
o Convolutional Neural Networks (CNNs)
Transformers
Which statements are true?
o Pre-training an LLM requires several weeks of training on a cluster of many GPUs.
o An LLM pre-trained on crawled data learns so much about the world that it can be used directly as a chat assistant.
o Fine-tuning an LLM needs less data and fewer computational resources than pre-training, but requires high-quality data.
o Showing an LLM a few shots improves the quality of the output.
o Forcing the LLM to give direct answers has no impact on quality.
o RAG models empower LLMs to enhance their understanding and text generation abilities by accessing knowledge beyond their training data.
o Hallucinations are a solved problem for LLMs: The problem was to distinguish between real sensations and invented perceptions.
Which programming language(s) is/are widely used in Data Science for data analysis?
o Python
o Java
o JavaScript
o R
o C
o C++
Python
R
What is the primary goal of data visualization in the context of the data science pipeline?
o Automating data processing tasks
o Increasing data storage capacity
o Enhancing data security
o Facilitating data interpretation and communication
Choose whether the following statement is true or false: Unstructured data is not organized.
o False
o Neither nor.
o True
True
Match each example to a type (nominal, ordinal, binary)
o List of names
o Passed test
o Amazon review rating
o Colors: red, blue, green
o Integer
o Presence or absence
o List of names -> nominal
o Passed test -> binary
o Amazon review rating -> ordinal
o Colors: red, blue, green -> nominal
o Integer -> ordinal
o Presence or absence -> binary
Common Python libraries: (TensorFlow, Requests, Pandas, Gap, BeautifulSoup, Shiny, Seaborn, HTTPServer)
o is a popular open-source Python library used for data manipulation and analysis.
o is a Python visualization library based on Matplotlib.
o is a popular Python library for making HTTP requests. It simplifies sending HTTP requests and receiving responses from web services or APIs.
o is an open-source deep learning framework developed by Google.
o is a Python library that parses HTML and XML documents.
è Pandas
è Seaborn
è Requests
è TensorFlow
è BeautifulSoup
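As a hedged illustration of the libraries matched above, the sketch below shows one typical call for each; the URL, column names, and model size are made-up placeholders, and all libraries must be installed separately.

import pandas as pd                 # data manipulation and analysis
import seaborn as sns               # visualization built on Matplotlib
import requests                     # HTTP requests to web services / APIs
from bs4 import BeautifulSoup       # parsing HTML and XML documents
import tensorflow as tf             # deep learning framework by Google

df = pd.DataFrame({"x": [1, 2, 3], "y": [2, 4, 6]})     # small in-memory table
sns.scatterplot(data=df, x="x", y="y")                   # quick scatterplot

response = requests.get("https://example.com")           # placeholder URL
soup = BeautifulSoup(response.text, "html.parser")       # parse the returned HTML
print(soup.title)

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # minimal TensorFlow model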
How does feature selection contribute to improving the performance of machine learning models?
o By reducing the number of irrelevant features
o By duplicating existing features
o By randomizing feature values
o By increasing the complexity of the model
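A minimal sketch of feature selection, using scikit-learn as an assumed library (it is not named in the question): only the k highest-scoring features are kept, the rest are dropped.

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# toy dataset: 100 samples, 10 features, only 3 of them informative
X, y = make_classification(n_samples=100, n_features=10, n_informative=3, random_state=0)

selector = SelectKBest(score_func=f_classif, k=3)   # keep the 3 best-scoring features
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)               # (100, 10) -> (100, 3)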
Data _______________ refers to the graphical representation of data.
o Visualisation
o Plotting
o Handling
o Analysis
Visualisation
The following code snippet is given
Tick the correct statements.
o The scatterplot is saved to a file named 'scatterplot.png'.
o The x-axis label is 'Height (cm)' and the y-axis label is 'Weight (kg)'.
o The DataFrame 'df' contains columns for 'Age', 'Height', and 'Weight'.
o The DataFrame 'data' is created using a dictionary.
o The scatterplot is created using seaborn's 'lineplot' function.
o The age is considered in the visualisation.
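The original snippet is not reproduced in this copy. A hypothetical snippet that the statements above could refer to might look like the following; the column names, axis labels, and file name are taken from the answer options, everything else is an assumption.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = {"Age": [25, 32, 47], "Height": [170, 180, 165], "Weight": [65, 80, 70]}
df = pd.DataFrame(data)

sns.scatterplot(data=df, x="Height", y="Weight")
plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.savefig("scatterplot.png")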
Drag and Drop
è ... tasks involve training models to predict certain parts of the input from other parts, leveraging the inherent structure or relationships within the data itself to generate supervision signals without requiring external labels.
è ... involves training agents to make sequential decisions through interaction with an environment, where they learn to maximize cumulative rewards by exploring actions and observing the consequences of their decisions.
è ... combines elements of supervised and unsupervised learning by utilizing both labeled and unlabeled data to improve model performance, leveraging the potentially vast amounts of unlabeled data to enhance learning from limited labeled examples.
è ... involves training algorithms on unlabeled data, aiming to uncover hidden patterns or structures within the data without explicit guidance on what to look for.
è In ..., an algorithm learns from labeled data, where input-output pairs are provided, allowing it to make predictions or decisions based on new input data.
o Unsupervised learning
o Reinforcement learning
o Supervised learning
o Semi-supervised learning
o Self-supervised learning
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
Your spam filter is a machine learning program that, given examples of spam emails (flagged by users) and examples of regular emails (nonspam, also called “ham”), can learn to flag spam. The examples that the system uses to learn are called the training set. Each training example is called a training instance (or sample). The part of a machine learning system that learns and makes predictions is called a model. Neural networks and random forests are examples of models.
è is to flag spam for new emails
è needs to be defined; for example accuracy
è is the training data
o The performance measure P
o The experience E
o Task T
o Hidden layers in a neural network are always the last layer of the network, responsible for making the final predictions.
o A single perceptron can learn complex decision boundaries and effectively model complex relationships between inputs and outputs commonly found in real-world datasets.
o Backpropagation is a technique used to train neural networks by updating the weights based on the error calculated between the predicted output and the actual output.
o The purpose of a loss function in deep learning is to measure how well the neural network's output matches the desired output, guiding the optimization process during training.
o Batches in deep learning refer to the division of the training dataset into smaller subsets, allowing for more efficient computation and optimization of the neural network's parameters.
o ReLU (Rectified Linear Unit) is a commonly used activation function that returns the input if it is positive and zero otherwise, effectively solving the vanishing gradient problem.
o Activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns and relationships in data.
o Activation functions are only used in the output layer of neural networks and have no impact on the intermediate layers during the forward pass.
o The identity activation function, which simply outputs the input without any transformation, is commonly used in deep neural networks to introduce non-linearity.
o In machine learning, the training set is used to train the model, the validation set is used to tune hyperparameters, and the test set is used to evaluate the model's performance on unseen data.
o Splitting the dataset into training, validation, and test sets helps ensure that the model's performance generalizes well to unseen data by providing a way to assess its performance on data it hasn't been trained on.
o The training set is used to fine-tune the model's parameters, while the validation set is used to train the model on unseen data
o The test set is typically larger than the training set and is used to optimize the model's performance during the training phase.
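As a hedged sketch of the train/validation/test idea above, using scikit-learn's train_test_split (an assumption; any splitting utility would do): the data is split twice so that three disjoint subsets remain.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# first split off the test set, then carve a validation set out of the remainder
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))   # roughly 600 / 200 / 200 samples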
What result do I get?
T = torch.tensor([[[[1, 2, 3],[3, 6, 9],[2, 4, 5]]]])
T.shape
o torch.Size([3, 3, 1, 1])
o torch.Size([9])
o torch.Size([1, 1, 3, 3])
o torch.Size([3, 3])
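The shape can be read off from the nesting of the brackets, outermost dimension first; a quick way to check this in PyTorch:

import torch

T = torch.tensor([[[[1, 2, 3], [3, 6, 9], [2, 4, 5]]]])
print(T.shape)   # two outer singleton dimensions wrap a 3x3 block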
Which output layer activation function is suitable for which problem?
torch.nn.functional.sigmoid
torch.nn.functional.softmax
torch.nn.Linear (no activation / identity output)
Binary classification
Multi-class classification
Regression problem
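A minimal sketch of the three output-layer choices in PyTorch (batch size and layer sizes are arbitrary assumptions): sigmoid squashes a single logit into [0, 1], softmax turns a vector of logits into a probability distribution, and a plain linear output stays untransformed.

import torch
import torch.nn.functional as F

logits_binary = torch.randn(4, 1)          # 4 samples, 1 logit each
p_binary = torch.sigmoid(logits_binary)    # probabilities in [0, 1]

logits_multi = torch.randn(4, 3)           # 4 samples, 3 classes
p_multi = F.softmax(logits_multi, dim=-1)  # each row sums to 1

regression_out = torch.nn.Linear(10, 1)(torch.randn(4, 10))  # raw, unbounded values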
Multiple choice
import torch
from torch import nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(20, 100)
        self.layer_2 = nn.Linear(100, 100)
        self.layer_3 = nn.Linear(100, 1)

    def forward(self, x):
        x = self.layer_1(x)
        x = torch.nn.functional.relu(x)
        x = self.layer_2(x)
        x = self.layer_3(x)
        return x
o The following network can be used for regression.
o The following network has three hidden layers.
o The following network can be used for binary classification.
o The input dimension of this net is 20.
o The following network uses three non-linear activation functions.
o The first layer has 100 neurons.
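Building on the class defined above, a quick sanity check of its dimensions (the batch size of 8 is an arbitrary assumption):

net = NeuralNetwork()
x = torch.randn(8, 20)   # batch of 8 samples with 20 input features
out = net(x)
print(out.shape)         # one raw output value per sample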
num_epochs = 100

for epoch in range(num_epochs):
    for X, y in train_dataloader:
        optimizer.zero_grad()
        pred = model(X)
        loss = loss_function(pred, y.unsqueeze(-1))
        loss.backward()
        optimizer.step()
o All errors over time are saved in the loss variable.
o The following training uses the training data 100 times.
o This training algorithm has a termination criterion which prevents overfitting.
o In the following training, the data is evaluated on the training data as well as on the test data.
o The loss indicates the accuracy of the model as a percentage [0-100%].
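A hypothetical extension of the loop above (not part of the original snippet): recording the per-step loss values for later inspection is a common pattern.

loss_history = []

for epoch in range(num_epochs):
    for X, y in train_dataloader:
        optimizer.zero_grad()
        pred = model(X)
        loss = loss_function(pred, y.unsqueeze(-1))
        loss.backward()
        optimizer.step()
        loss_history.append(loss.item())   # keep a record of every training step's loss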
Single Choice
AI is set to play a crucial role in the next industrial revolution, also known as Industry ..... .
o 2.0
o 5.0
o 4.0
o 3.0
o 1.0
4.0
Cloze text with dropdown
(Causal AI, Knowledge Graphs, Operational AI systems, smart robots, AI simulation, Responsible AI, Multiagent systems (MAS), Composite AI, Edge AI)
o is an approach to developing and deploying artificial intelligence from both an ethical and legal standpoint.
o are AI-powered machines designed to autonomously execute one or more physical tasks.
o refers to the deployment of AI algorithms and AI models directly on local edge devices such as sensors or Internet of Things (IoT) devices, which enables real-time data processing and analysis without constant reliance on cloud infrastructure.
o refers to developing AI agents and the simulated environments in which they can be trained, tested and sometimes deployed.
è Responsible AI
è smart robots
è Edge AI
è AI simulation
Gartner's Hype Cycle Phases
Sort Gartner's Hype Cycle phases from the potential technology breakthrough (top) to the technology's broad market applicability and relevance (bottom).
Innovation Trigger
Plateau of Productivity
Trough of Disillusionment
Slope of Enlightenment
Peak of Inflated Expectations
What does GPT stand for?
o Generative Purpose Transformer
o General Pretrained Transformer
o Generative Pretrained Transformer
What is not part of a Transformer?
o Feed Forward Layer
o Convolutional Layer
o Embedding
Convolutional Layer
What is the functionality of this code snippet?
o Tokenization
o Add & Norm
o Attention Layer
Tokenization
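The referenced snippet is not reproduced in this copy. As a hedged illustration of what a tokenization step typically looks like (the toy vocabulary and whitespace splitting are simplifying assumptions; real LLMs use subword tokenizers such as BPE):

text = "transformers process tokens not characters"

# build a toy vocabulary mapping each word to an integer id
vocab = {word: idx for idx, word in enumerate(sorted(set(text.split())))}

token_ids = [vocab[word] for word in text.split()]
print(token_ids)   # the text as a sequence of integer ids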
What happens during the image to patches step in Vision Transformers?
o The input image is converted into grayscale to simplify computations
o The input image's color depth is reduced to enhance processing speed.
o The input image is divided into a grid of fixed-size patches for processing
o The input image is resized to match the network's input requirements
The input image is divided into a grid of fixed-size patches for processing
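A minimal sketch of the image-to-patches step (the 32x32 image with 3 channels and the patch size of 8 are arbitrary assumptions), using tensor reshaping in PyTorch:

import torch

img = torch.randn(3, 32, 32)     # (channels, height, width)
patch = 8

# unfold height and width into non-overlapping 8x8 patches
patches = img.unfold(1, patch, patch).unfold(2, patch, patch)           # (3, 4, 4, 8, 8)
patches = patches.permute(1, 2, 0, 3, 4).reshape(-1, 3 * patch * patch)
print(patches.shape)             # 16 flattened patches, each of length 192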
Which of the following statements regarding Vision Transformers (ViTs) are correct?
o ViTs are based on Transformers and represent an Encoder-Only architecture.
o ViTs do not pass vectors to the Transformer Encoder, but a sequence of position embeddings.
o The output of the Patch Embedding step is a sequence of embeddings.
o After Normalization, the inputs are passed to the Multi-Head Attention block which is part of the Decoder.
o Convolutional Neural Networks (CNNs) use the same architecture as ViTs.
o Neural networks are part of ViTs.
What's the difference in the architecture between Vision Transformers and Sequence2Sequence models?
o ViTs are Decoder-Only, Sequence2Sequence models are Encoder-Decoder.
o ViTs use Convolutional Layers, Sequence2Sequence models use Fully Connected Layers.
o ViTs are Encoder-Only, Sequence2Sequence models are Encoder-Decoder.
o ViTs are based on RNNs, Sequence2Sequence models are based on CNNs.
Which of the following best describes the two key aspects involved in improving a language model's performance through “Generated Knowledge Prompting”?
o Knowledge Generation and Knowledge Dissemination
o Knowledge Generation and Knowledge Integration
o Knowledge Integration and Knowledge Preservation
o Knowledge Generation and Knowledge Retention
Which of the following prompting techniques does not exist in prompt engineering?
o Tree of Thoughts
o Reverse-shot prompting
o Retrieval Augmented Generation (RAG)
o Self-Consistency
Why do hallucinations occur in LLMs?
o Hallucinations occur due to the system being overloaded with too much data.
o Hallucinations occur when the system lacks sufficient computational power
o Hallucinations occur because the system is running on outdated hardware.
o Hallucinations occur due to incomplete or biased data
What is NOT a reason for splitting the data?
o Contextual Understanding
o Data Integrity
o Memory Efficiency
o Parallel Processing
In the context of Retrieval Augmented Generation (RAG), why is retrieving contextual documents important?
o To simplify the training process
o To reduce the size of the model
o To provide relevant information that enhances the model's responses
o To increase the computational requirements
What is the main purpose of a vector database in the context of Retrieval-Augmented Generation (RAG)?
o To enhance the model's language generation capabilities
o To store traditional relational data
o To store and retrieve embedding vectors representing text
o To reduce the overall size of the dataset
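A toy sketch of the vector-store idea behind the question above (the embedding function and documents are made up; real systems use learned embedding models and dedicated vector databases): texts are stored as vectors and retrieved by similarity to the query vector.

import numpy as np

def embed(text):
    # stand-in for a real embedding model: a crude bag-of-characters vector
    vec = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1
    return vec / (np.linalg.norm(vec) + 1e-9)

docs = ["the capital of France is Paris",
        "transformers use attention",
        "pandas is a data analysis library"]
index = np.stack([embed(d) for d in docs])   # the "vector database"

query = embed("what is the capital of France")
scores = index @ query                        # cosine similarity (vectors are normalized)
print(docs[int(np.argmax(scores))])           # most relevant document is handed to the LLM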
What is the primary advantage of using Retrieval-Augmented Generation (RAG) in large language models?
o It can access and use external knowledge
o It decreases the computational load on the model
o It enhances the model's creativity
o It reduces the need for training data
Which of the following is a primary limitation of Retrieval-Augmented Generation (RAG)?
o It always provides the most accurate and up-to-date information
o It can struggle with purely creative tasks
o It cannot access external databases
o It requires significant computational resources to function effectively
Balancing Exploration and Exploitation: In a Deep Q-Learning algorithm, what strategy should an agent use to balance exploration (trying new actions) and exploitation (using known actions) to maximize long-term rewards?
A. Always choose the action with the highest immediate reward.
B. Always choose new actions to explore all possible outcomes.
C. Use a combination of exploring new actions and exploiting known actions based on the agent’s learned Q-values.
D. Choose actions randomly to ensure equal exploration of all options.
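A minimal sketch of epsilon-greedy action selection, one common way to balance exploration and exploitation in Q-learning (the epsilon value and Q-values are assumptions):

import random

def select_action(q_values, epsilon=0.1):
    """q_values: list of learned Q-values, one per action."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))                        # explore: random action
    return max(range(len(q_values)), key=lambda a: q_values[a])       # exploit: best known action

print(select_action([0.2, 1.5, -0.3]))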
Question 2: Improving Stability in DRL Training: What technique can be used in Deep Q-Networks (DQN) to improve the stability and efficiency of the training process?
A. Update the Q-values after every single action to maintain accuracy.
B. Use experience replay to store and reuse past experiences for training.
C. Train the agent on different tasks simultaneously to improve generalization.
D. Increase the learning rate to speed up the training process.
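A minimal sketch of an experience replay buffer (capacity, batch size, and the dummy transitions are assumptions): transitions are stored and later sampled in random mini-batches, which decorrelates updates during DQN training.

import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are dropped automatically

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        return random.sample(list(self.buffer), batch_size)   # random mini-batch for training

buffer = ReplayBuffer()
for t in range(100):
    buffer.push(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
batch = buffer.sample(8)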
Question 3: Application of DRL in Gaming: Which achievement best demonstrates the application of Deep Reinforcement Learning in gaming?
A. A chess algorithm that always uses the same opening move.
B. A supervised learning model that identifies objects in images.
C. A DRL agent that learns to play and excels at various Atari games without prior knowledge of the game rules.
D. A simple decision tree used for sorting data.