Which neural network architecture is commonly used in Large Language Models (LLMs)?
o Transformer
o Recurrent Neural Networks (RNNs)
o Long Short-Term Memory (LSTM)
o Convolutional Neural Networks (CNNs)
Transformers
Which statements are true?
o Pre-training an LLM requires several weeks of training on a cluster of many GPUs.
o An LLM pre-trained on crawled data learns so much about the world that it can be used directly as a chat assistant.
o Fine-tuning an LLM needs less data and fewer computational resources than pre-training, but requires high-quality data.
o Showing an LLM a few shots improves the quality of the output.
o Forcing the LLM to give direct answers has no impact on quality.
o RAG models empower LLMs to enhance their understanding and text generation abilities by accessing knowledge beyond their training data.
o Hallucinations are a solved problem for LLMs: The problem was to distinguish between real sensations and invented perceptions.
Which programming language(s) is/are widely used in Data Science for data analysis?
o Python
o Java
o JavaScript
o R
o C
o C++
Python
R
What is the primary goal of data visualization in the context of the data science pipeline?
o Automating data processing tasks
o Increasing data storage capacity
o Enhancing data security
o Facilitating data interpretation and communication
Choose whether the following statement is true or false: Unstructured data is not organized.
o False
o Neither nor.
o True
True
Match each example to a type (nominal, ordinal, binary)
o List of names
o Passed test
o Amazon review rating
o Colors: red, blue, green
o Integer
o Presence or absence
o List of names -> nominal
o Passed test -> binary
o Amazon review rating -> ordinal
o Colors: red, blue, green -> nominal
o Integer -> ordinal
o Presence or absence -> binary
Common Python libraries: (TensorFlow, Requests, Pandas, Gap, BeautifulSoup, Shiny, Seaborn, HTTPServer)
o is a popular open-source Python library used for data manipulation and analysis.
o is a Python visualization library based on Matplotlib.
o is a popular Python library for making HTTP requests. It simplifies sending HTTP requests and receiving responses from web services or APIs.
o is an open-source deep learning framework developed by Google.
o is a Python library that parses HTML and XML documents.
è Pandas
è Seaborn
è Requests
è TensorFlow
è BeautifulSoup
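As a hedged illustration of the libraries matched above, the sketch below shows one typical call for each; the URL, column names, and model size are made-up placeholders, and all libraries must be installed separately.

import pandas as pd                 # data manipulation and analysis
import seaborn as sns               # visualization built on Matplotlib
import requests                     # HTTP requests to web services / APIs
from bs4 import BeautifulSoup       # parsing HTML and XML documents
import tensorflow as tf             # deep learning framework by Google

df = pd.DataFrame({"x": [1, 2, 3], "y": [2, 4, 6]})     # small in-memory table
sns.scatterplot(data=df, x="x", y="y")                   # quick scatterplot

response = requests.get("https://example.com")           # placeholder URL
soup = BeautifulSoup(response.text, "html.parser")       # parse the returned HTML
print(soup.title)

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # minimal TensorFlow model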
How does feature selection contribute to improving the performance of machine learning models?
o By reducing the number of irrelevant features
o By duplicating existing features
o By randomizing feature values
o By increasing the complexity of the model
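A minimal sketch of feature selection, using scikit-learn as an assumed library (it is not named in the question): only the k highest-scoring features are kept, the rest are dropped.

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# toy dataset: 100 samples, 10 features, only 3 of them informative
X, y = make_classification(n_samples=100, n_features=10, n_informative=3, random_state=0)

selector = SelectKBest(score_func=f_classif, k=3)   # keep the 3 best-scoring features
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)               # (100, 10) -> (100, 3)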
Data _______________ refers to the graphical representation of data.
o Visualisation
o Plotting
o Handling
o Analysis
Visualisation
The following code snippet is given
Tick the correct statements.
o The scatterplot is saved to a file named 'scatterplot.png'.
o The x-axis label is 'Height (cm)' and the y-axis label is 'Weight (kg)'.
o The DataFrame 'df' contains columns for 'Age', 'Height', and 'Weight'.
o The DataFrame 'data' is created using a dictionary.
o The scatterplot is created using seaborn's 'lineplot' function.
o The age is considered in the visualisation.
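The original snippet is not reproduced in this copy. A hypothetical snippet that the statements above could refer to might look like the following; the column names, axis labels, and file name are taken from the answer options, everything else is an assumption.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = {"Age": [25, 32, 47], "Height": [170, 180, 165], "Weight": [65, 80, 70]}
df = pd.DataFrame(data)

sns.scatterplot(data=df, x="Height", y="Weight")
plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.savefig("scatterplot.png")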
Drag and Drop
è ... tasks involve training models to predict certain parts of the input from other parts, leveraging the inherent structure or relationships within the data itself to generate supervision signals without requiring external labels.
è ... involves training agents to make sequential decisions through interaction with an environment, where they learn to maximize cumulative rewards by exploring actions and observing the consequences of their decisions.
è ... combines elements of supervised and unsupervised learning by utilizing both labeled and unlabeled data to improve model performance, leveraging the potentially vast amounts of unlabeled data to enhance learning from limited labeled examples.
è ... involves training algorithms on unlabeled data, aiming to uncover hidden patterns or structures within the data without explicit guidance on what to look for.
è In ..., an algorithm learns from labeled data, where input-output pairs are provided, allowing it to make predictions or decisions based on new input data.
o Unsupervised learning
o Reinforcement learning
o Supervised learning
o Semi-supervised learning
o Self-supervised learning
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
Your spam filter is a machine learning program that, given examples of spam emails (flagged by users) and examples of regular emails (nonspam, also called “ham”), can learn to flag spam. The examples that the system uses to learn are called the training set. Each training example is called a training instance (or sample). The part of a machine learning system that learns and makes predictions is called a model. Neural networks and random forests are examples of models.
è is to flag spam for new emails
è needs to be defined; for example accuracy
è is the training data
o The performance measure P
o The experience E
o Task T
o Hidden layers in a neural network are always the last layer of the network, responsible for making the final predictions.
o A single perceptron can learn complex decision boundaries and effectively model complex relationships between inputs and outputs commonly found in real-world datasets.
o Backpropagation is a technique used to train neural networks by updating the weights based on the error calculated between the predicted output and the actual output.
o The purpose of a loss function in deep learning is to measure how well the neural network's output matches the desired output, guiding the optimization process during training.
o Batches in deep learning refer to the division of the training dataset into smaller subsets, allowing for more efficient computation and optimization of the neural network's parameters.
o ReLU (Rectified Linear Unit) is a commonly used activation function that returns the input if it is positive and zero otherwise, effectively solving the vanishing gradient problem.
o Activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns and relationships in data.
o Activation functions are only used in the output layer of neural networks and have no impact on the intermediate layers during the forward pass.
o The identity activation function, which simply outputs the input without any transformation, is commonly used in deep neural networks to introduce non-linearity.
o In machine learning, the training set is used to train the model, the validation set is used to tune hyperparameters, and the test set is used to evaluate the model's performance on unseen data.
o Splitting the dataset into training, validation, and test sets helps ensure that the model's performance generalizes well to unseen data by providing a way to assess its performance on data it hasn't been trained on.
o The training set is used to fine-tune the model's parameters, while the validation set is used to train the model on unseen data
o The test set is typically larger than the training set and is used to optimize the model's performance during the training phase.
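As a hedged sketch of the train/validation/test idea above, using scikit-learn's train_test_split (an assumption; any splitting utility would do): the data is split twice so that three disjoint subsets remain.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# first split off the test set, then carve a validation set out of the remainder
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))   # roughly 600 / 200 / 200 samples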
What result do I get?
T = torch.tensor([[[[1, 2, 3],[3, 6, 9],[2, 4, 5]]]])
T.shape
o torch.Size([3, 3, 1, 1])
o torch.Size([9])
o torch.Size([1, 1, 3, 3])
o torch.Size([3, 3])
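The shape can be read off from the nesting of the brackets, outermost dimension first; a quick way to check this in PyTorch:

import torch

T = torch.tensor([[[[1, 2, 3], [3, 6, 9], [2, 4, 5]]]])
print(T.shape)   # two outer singleton dimensions wrap a 3x3 block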
Which output layer activation function is suitable for which problem?
torch.nn.functional.sigmoid
torch.nn.functional.softmax
torch.nn.Linear (no activation / identity output)
Binary classification
Multi-class classification
Regression problem
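A minimal sketch of the three output-layer choices in PyTorch (batch size and layer sizes are arbitrary assumptions): sigmoid squashes a single logit into [0, 1], softmax turns a vector of logits into a probability distribution, and a plain linear output stays untransformed.

import torch
import torch.nn.functional as F

logits_binary = torch.randn(4, 1)          # 4 samples, 1 logit each
p_binary = torch.sigmoid(logits_binary)    # probabilities in [0, 1]

logits_multi = torch.randn(4, 3)           # 4 samples, 3 classes
p_multi = F.softmax(logits_multi, dim=-1)  # each row sums to 1

regression_out = torch.nn.Linear(10, 1)(torch.randn(4, 10))  # raw, unbounded values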
Multiple choice
import torch
from torch import nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(20, 100)
        self.layer_2 = nn.Linear(100, 100)
        self.layer_3 = nn.Linear(100, 1)

    def forward(self, x):
        x = self.layer_1(x)
        x = torch.nn.functional.relu(x)
        x = self.layer_2(x)
        x = self.layer_3(x)
        return x
o The following network can be used for regression.
o The following network has three hidden layers.
o The following network can be used for binary classification.
o The input dimension of this net is 20.
o The following network uses three non-linear activation functions.
o The first layer has 100 neurons.
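Building on the class defined above, a quick sanity check of its dimensions (the batch size of 8 is an arbitrary assumption):

net = NeuralNetwork()
x = torch.randn(8, 20)   # batch of 8 samples with 20 input features
out = net(x)
print(out.shape)         # one raw output value per sample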
num_epochs = 100

for epoch in range(num_epochs):
    for X, y in train_dataloader:
        optimizer.zero_grad()
        pred = model(X)
        loss = loss_function(pred, y.unsqueeze(-1))
        loss.backward()
        optimizer.step()
o All errors over time are saved in the loss variable.
o The following training uses the training data 100 times.
o This training algorithm has a termination criterion which prevents overfitting.
o In the following training, the data is evaluated on the training data as well as on the test data.
o The loss indicates the accuracy of the model as a percentage [0-100%].
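A hypothetical extension of the loop above (not part of the original snippet): recording the per-step loss values for later inspection is a common pattern.

loss_history = []

for epoch in range(num_epochs):
    for X, y in train_dataloader:
        optimizer.zero_grad()
        pred = model(X)
        loss = loss_function(pred, y.unsqueeze(-1))
        loss.backward()
        optimizer.step()
        loss_history.append(loss.item())   # keep a record of every training step's loss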
Single Choice
AI is set to play a crucial role in the next industrial revolution, also known as Industry ..... .
o 2.0
o 5.0
o 4.0
o 3.0
o 1.0
4.0
Cloze text with dropdown
(Causal AI, Knowledge Graphs, Operational AI systems, smart robots, AI simulation, Responsible AI, Multiagent systems (MAS), Composite AI, Edge AI)
o is an approach to developing and deploying artificial intelligence from both an ethical and legal standpoint.
o are AI-powered machines designed to autonomously execute one or more physical tasks.
o refers to the deployment of AI algorithms and AI models directly on local edge devices such as sensors or Internet of Things (IoT) devices, which enables real-time data processing and analysis without constant reliance on cloud infrastructure.
o refers to developing AI agents and the simulated environments in which they can be trained, tested and sometimes deployed.
è Responsible AI
è smart robots
è Edge AI
è AI simulation
Gartner's Hype Cycle Phases
Sort Gartner's Hype Cycle phases from the potential technology breakthrough (top) to the technology's broad market applicability and relevance (bottom).
Innovation Trigger
Plateau of Productivity
Trough of Disillusionment
Slope of Enlightenment
Peak of Inflated Expectations
What does GPT stand for?
o Generative Purpose Transformer
o General Pretrained Transformer
o Generative Pretrained Transformer
What is not part of a Transformer?
o Feed Forward Layer
o Convolutional Layer
o Embedding
Convolutional Layer
What is the functionality of this code snippet?
o Tokenization
o Add & Norm
o Attention Layer
Tokenization
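The referenced snippet is not reproduced in this copy. As a hedged illustration of what a tokenization step typically looks like (the toy vocabulary and whitespace splitting are simplifying assumptions; real LLMs use subword tokenizers such as BPE):

text = "transformers process tokens not characters"

# build a toy vocabulary mapping each word to an integer id
vocab = {word: idx for idx, word in enumerate(sorted(set(text.split())))}

token_ids = [vocab[word] for word in text.split()]
print(token_ids)   # the text as a sequence of integer ids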
What happens during the image to patches step in Vision Transformers?
o The input image is converted into grayscale to simplify computations
o The input image's color depth is reduced to enhance processing speed.
o The input image is divided into a grid of fixed-size patches for processing
o The input image is resized to match the network's input requirements
The input image is divided into a grid of fixed-size patches for processing
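A minimal sketch of the image-to-patches step (the 32x32 image with 3 channels and the patch size of 8 are arbitrary assumptions), using tensor reshaping in PyTorch:

import torch

img = torch.randn(3, 32, 32)     # (channels, height, width)
patch = 8

# unfold height and width into non-overlapping 8x8 patches
patches = img.unfold(1, patch, patch).unfold(2, patch, patch)           # (3, 4, 4, 8, 8)
patches = patches.permute(1, 2, 0, 3, 4).reshape(-1, 3 * patch * patch)
print(patches.shape)             # 16 flattened patches, each of length 192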
Which of the following statements regarding Vision Transformers (ViTs) are correct?
o ViTs are based on Transformers and represent an Encoder-Only architecture.
o ViTs do not pass vectors to the Transformer Encoder, but a sequence of position embeddings.
o The output of the Patch Embedding step is a sequence of embeddings.
o After Normalization, the inputs are passed to the Multi-Head Attention block which is part of the Decoder.
o Convolutional Neural Networks (CNNs) use the same architecture as ViTs.
o Neural networks are part of ViTs.
What's the difference in the architecture between Vision Transformers and Sequence2Sequence models?
o ViTs are Decoder-Only, Sequence2Sequence models are Encoder-Decoder.
o ViTs use Convolutional Layers, Sequence2Sequence models use Fully Connected Layers.
o ViTs are Encoder-Only, Sequence2Sequence models are Encoder-Decoder.
o ViTs are based on RNNs, Sequence2Sequence models are based on CNNs.
Which of the following best describes the two key aspects involved in improving a language model's performance through “Generated Knowledge Prompting”?
o Knowledge Generation and Knowledge Dissemination
o Knowledge Generation and Knowledge Integration
o Knowledge Integration and Knowledge Preservation
o Knowledge Generation and Knowledge Retention
Which of the following prompting techniques does not exist in prompt engineering?
o Tree of Thoughts
o Reverse-shot prompting
o Retrieval Augmented Generation (RAG)
o Self-Consistency
Why do hallucinations occur in LLMs?
o Hallucinations occur due to the system being overloaded with too much data.
o Hallucinations occur when the system lacks sufficient computational power
o Hallucinations occur because the system is running on outdated hardware.
o Hallucinations occur due to incomplete or biased data
What is NOT a reason for splitting the data?
o Contextual Understanding
o Data Integrity
o Memory Efficiency
o Parallel Processing
In the context of Retrieval Augmented Generation (RAG), why is retrieving contextual documents important?
o To simplify the training process
o To reduce the size of the model
o To provide relevant information that enhances the model's responses
o To increase the computational requirements
What is the main purpose of a vector database in the context of Retrieval-Augmented Generation (RAG)?
o To enhance the model's language generation capabilities
o To store traditional relational data
o To store and retrieve embedding vectors representing text
o To reduce the overall size of the dataset
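A toy sketch of the vector-store idea behind the question above (the embedding function and documents are made up; real systems use learned embedding models and dedicated vector databases): texts are stored as vectors and retrieved by similarity to the query vector.

import numpy as np

def embed(text):
    # stand-in for a real embedding model: a crude bag-of-characters vector
    vec = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1
    return vec / (np.linalg.norm(vec) + 1e-9)

docs = ["the capital of France is Paris",
        "transformers use attention",
        "pandas is a data analysis library"]
index = np.stack([embed(d) for d in docs])   # the "vector database"

query = embed("what is the capital of France")
scores = index @ query                        # cosine similarity (vectors are normalized)
print(docs[int(np.argmax(scores))])           # most relevant document is handed to the LLM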
What is the primary advantage of using Retrieval-Augmented Generation (RAG) in large language models?
o It can access and use external knowledge
o It decreases the computational load on the model
o It enhances the model's creativity
o It reduces the need for training data
Which of the following is a primary limitation of Retrieval-Augmented Generation (RAG)?
o It always provides the most accurate and up-to-date information
o It can struggle with purely creative tasks
o It cannot access external databases
o It requires significant computational resources to function effectively
Balancing Exploration and Exploitation: In a Deep Q-Learning algorithm, what strategy should an agent use to balance exploration (trying new actions) and exploitation (using known actions) to maximize long-term rewards?
A. Always choose the action with the highest immediate reward.
B. Always choose new actions to explore all possible outcomes.
C. Use a combination of exploring new actions and exploiting known actions based on the agent’s learned Q-values.
D. Choose actions randomly to ensure equal exploration of all options.
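A minimal sketch of epsilon-greedy action selection, one common way to balance exploration and exploitation in Q-learning (the epsilon value and Q-values are assumptions):

import random

def select_action(q_values, epsilon=0.1):
    """q_values: list of learned Q-values, one per action."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))                        # explore: random action
    return max(range(len(q_values)), key=lambda a: q_values[a])       # exploit: best known action

print(select_action([0.2, 1.5, -0.3]))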
Question 2: Improving Stability in DRL Training: What technique can be used in Deep Q-Networks (DQN) to improve the stability and efficiency of the training process?
A. Update the Q-values after every single action to maintain accuracy.
B. Use experience replay to store and reuse past experiences for training.
C. Train the agent on different tasks simultaneously to improve generalization.
D. Increase the learning rate to speed up the training process.
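A minimal sketch of an experience replay buffer (capacity, batch size, and the dummy transitions are assumptions): transitions are stored and later sampled in random mini-batches, which decorrelates updates during DQN training.

import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are dropped automatically

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        return random.sample(list(self.buffer), batch_size)   # random mini-batch for training

buffer = ReplayBuffer()
for t in range(100):
    buffer.push(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
batch = buffer.sample(8)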
Question 3: Application of DRL in Gaming: Which achievement best demonstrates the application of Deep Reinforcement Learning in gaming?
A. A chess algorithm that always uses the same opening move.
B. A supervised learning model that identifies objects in images.
C. A DRL agent that learns to play and excels at various Atari games without prior knowledge of the game rules.
D. A simple decision tree used for sorting data.