
SHORT QUESTIONS ON DECISION TREE


1. What is the main purpose of a decision tree in machine learning?
A decision tree is a tool that helps make decisions by showing possible outcomes step-by-step. It works for both classification (sorting things into categories) and regression (predicting numbers). The tree splits the data into smaller groups based on tests on different features. Each internal node is a question about a feature, and each leaf node is the final decision or prediction.
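The step-by-step structure described above can be sketched as nested feature tests. This is a minimal illustration, not a trained model — the fruit features and thresholds are invented for the example:

```python
def classify_fruit(weight_g, is_red):
    # Internal node: a test on the "weight" feature
    if weight_g > 150:
        return "grapefruit"   # leaf node: final prediction
    else:
        # Another internal node: a test on the "color" feature
        if is_red:
            return "apple"    # leaf node
        else:
            return "pear"     # leaf node

print(classify_fruit(200, False))  # grapefruit
print(classify_fruit(120, True))   # apple
```

Each `if` plays the role of an internal node asking a question about one feature, and each `return` is a leaf node giving the final class.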


2. What criterion is commonly used to split nodes in a decision tree?
The most common ways to decide where to split a node in a decision tree are Gini impurity and Information Gain (which uses entropy). These measures show how well a split separates the data into different classes, with the goal of making each group as pure (or similar) as possible.
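Both measures can be computed directly from class counts. The following sketch (pure Python, with hypothetical example labels) shows Gini impurity, entropy, and the information gain of a split, i.e. the parent's entropy minus the weighted entropy of the two child groups:

```python
import math

def class_counts(labels):
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return counts

def gini(labels):
    """Gini impurity: 1 - sum(p_i^2) over class proportions p_i."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in class_counts(labels).values())

def entropy(labels):
    """Shannon entropy: -sum(p_i * log2(p_i)); 0 for a pure group."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in class_counts(labels).values())

def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the children."""
    n = len(parent)
    child = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - child

parent = ["A", "A", "B", "B"]
print(gini(parent))                                    # 0.5
print(information_gain(parent, ["A", "A"], ["B", "B"]))  # 1.0 (perfect split)
```

A perfect split drives both child impurities to zero, so the information gain equals the parent's entropy — exactly the "make each group as pure as possible" goal described above.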


3. What is a leaf node in a decision tree?
A leaf node, also called a terminal node, is the final point at the end of a path in a decision tree. It represents the model’s ultimate output or decision after all the splits. For classification tasks, the leaf node gives the predicted class or category. For regression tasks, it provides the predicted numerical value. This is where the decision-making process in the tree finishes.




4. What is overfitting in the context of decision trees?
Overfitting happens when a decision tree learns the training data too closely, even picking up on random noise and outliers. This creates a very complex tree that works great on the training data but doesn’t do well on new, unseen data. In other words, the tree fails to generalize and makes poor predictions outside of the data it was trained on.


5. How can you prevent a decision tree from overfitting?
Overfitting can be avoided using different techniques. These include pruning, which means cutting off unnecessary branches of the tree; setting a maximum depth to limit how deep the tree can grow; requiring a minimum number of samples in each leaf or split to avoid making decisions based on very few points; or using ensemble methods like Random Forests, which combine many trees to improve accuracy and reduce overfitting.
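The depth and sample-count limits above amount to a simple pre-pruning check at every node. A minimal sketch (the thresholds are illustrative defaults, not recommendations):

```python
def should_split(depth, n_samples, max_depth=3, min_samples_split=4):
    """Pre-pruning: stop growing the tree when either limit is reached."""
    if depth >= max_depth:
        return False   # tree is already as deep as allowed
    if n_samples < min_samples_split:
        return False   # too few samples to justify another split
    return True

print(should_split(depth=1, n_samples=10))  # True
print(should_split(depth=3, n_samples=10))  # False: max depth reached
print(should_split(depth=1, n_samples=2))   # False: too few samples
```

Libraries expose the same idea as hyperparameters — for example, scikit-learn's `DecisionTreeClassifier` accepts `max_depth`, `min_samples_split`, and `min_samples_leaf`.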


6. Is a decision tree suitable for both classification and regression?
Yes, decision trees can be used for both classification and regression. In classification, the tree predicts which category or class a data point belongs to. In regression, the tree predicts a continuous number, typically the average of the training target values that fall into each leaf node.
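The regression case can be shown in one line: a leaf's prediction is simply the mean of the training targets it holds. The house-price values below are made up for the example:

```python
def regression_leaf_prediction(target_values):
    """A regression leaf predicts the mean of the training targets it holds."""
    return sum(target_values) / len(target_values)

# Hypothetical house prices (in $1000s) that ended up in the same leaf
leaf_values = [200, 220, 210, 230]
print(regression_leaf_prediction(leaf_values))  # 215.0
```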



7. What is the difference between Gini impurity and entropy?
Both entropy and Gini impurity are ways to measure how good a split is when building a decision tree. Entropy, from information theory, measures how mixed or uncertain the data is after the split—higher entropy means more disorder. Gini impurity measures how often a randomly chosen data point would be wrongly classified if it were randomly labeled according to the distribution of classes in the node. Gini impurity is usually faster to calculate (no logarithms) and is often preferred in practice because of its simplicity; in most cases the two criteria produce very similar trees.
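For a two-class node with class probability p, the two formulas are Gini = 1 − p² − (1 − p)² and entropy = −p·log₂(p) − (1 − p)·log₂(1 − p). A small sketch comparing them:

```python
import math

def gini_binary(p):
    """Gini impurity for a two-class node with class probability p."""
    return 1 - (p ** 2 + (1 - p) ** 2)

def entropy_binary(p):
    """Entropy (in bits) for a two-class node with class probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a pure node has no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Both peak at p = 0.5 (maximum disorder) and are 0 for pure nodes
print(gini_binary(0.5), entropy_binary(0.5))  # 0.5 1.0
print(gini_binary(1.0), entropy_binary(1.0))  # 0.0 0.0
```

The two curves have the same shape (zero at pure nodes, maximum at a 50/50 mix), which is why they usually select very similar splits.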
