MCA-5 Data Warehouse & Data Mining
PART-A
1. The cuboid that holds the lowest level of summarization is called the ________
(A) D – cuboid (B) Apex – cuboid (C) Base – cuboid (D) None
2. A multidimensional data model is typically organized around a _____________ .
(A) Logical table (B) Central table theme (C) Dimension table (D) None
3. The data cube is a ________ for multidimensional data storage.
(A) Instance (B) Schema (C) Metaphor (D) None
4. The cuboid which holds the highest level of summarization is called _____
(A) 1 – D cuboid (B) High – cuboid (C) Base cuboid (D) Apex – cuboid
5. In a star schema, each dimension is represented by only _________ that contains a set of ___
(A) One table (B) Many tables (C) Properties (D) Attributes
(1) AC (2) BC (3) AD (4) AD
6. OLAP provides a ______ environment for interactive data analysis
(A) User – safety (B) User – required (C) User – friendly (D) None
7. Each dimension contains multiple levels of ________ defined by concept hierarchies
(A) Behaviour (B) Property (C) Abstraction (D) None
8. The data cube in which all the dimensions are aggregated is referred to as the ____.
(A) Central cube (B) Central core (C) Base cube (D) None
9. The roll – up operation is also known as _____
(A) Roll – pit (B) Drill – pit (C) Drill – up (D) None
10. The roll – up operation aggregates the data by _________ the concept hierarchy.
(A) Descending (B) Ascending (C) Aggregating (D) None
11. Real – world data tend to be noisy and ________.
(A) Consistent (B) Inconsistent (C) Complete (D) None
12. Data cleaning steps attempt to ______ missing values.
(A) Join (B) Complete (C) Fill in (D) None
13. Which approach to filling in missing values is time consuming and may not be feasible?
(A) Ignore the tuple (B) Fill in manually
(C) Use a global constant (D) Use the attribute mean
14. Noise is a type of _________ in a measured variable.
(A) Numeric attribute (B) Random error (C) Technique (D) None
15. In smoothing by bin means, each value in a bin is replaced by the _______ value of the bin.
(A) Bin (B) Exact (C) Mean (D) None
16. In smoothing by bin boundaries, the minimum and maximum values in a given bin are identified as the ______ boundaries.
(A) Mean (B) Bin (C) Nearest (D) None
17. Outliers may be detected by ________, where similar values are organized into groups.
(A) Grouping (B) Joining (C) Clustering (D) None
18. Data can be smoothed by fitting the data to a function such as ____
(A) Aggregation (B) Clustering (C) Regression (D) None
19. Multiple linear regression is an ______ of linear regression in which more than two variables are involved and the data are fit to a multidimensional surface.
(A) Group (B) Extension (C) Part (D) None
20. Data mining often requires _____, that is, the merging of data from multiple data stores.
(A) Data aggregation (B) Data integration (C) Data merging (D) None
21. The correlation between attribute A and B can be measured by
(A) (B)
(C) (D)
22. Which of the following is not involved in data transformation?
(A) Smoothing (B) Aggregation (C) Functional dependence (D) Generalization
23. Z – score normalization is also known as ____
(A) Base normalization (B) Zero – mean normalization (C) I – D mean normalization (D) None
24. Which is not a strategy for data reduction?
(A) Data cube aggregation (B) Dimension reduction
(C) Integrity concept (D) None
25. Data reduction techniques can be used to obtain a reduced ______ of the data set that is much smaller in volume.
(A) Arrangement (B) Visualization (C) Representation (D) None
26. The stepwise backward elimination procedure starts with the _____ set of attributes.
(A) Empty (B) Full (C) Reduced (D) None
27. Classification and prediction are two forms of ____ that can be used to extract models describing important data.
(A) Data prediction (B) Data analysis (C) Data aggregation (D) Data grouping
28. The data tuples analyzed to build the model collectively form the _____
(A) Training data set (B) Training samples (C) Supervised learning (D) None
29. Concept hierarchies may be used for which purpose?
(A) Data cleaning (B) Data transformation (C) Normalization (D) None
30. Which is not a criterion used for comparing classification and prediction methods?
(A) Speed (B) Predictive accuracy (C) Scalability (D) Reliability
31. In classification by decision tree induction, a path is traced from the root to a __ node that holds the class prediction for that sample.
(A) Root (B) Leaf (C) Single branch (D) None
32. When decision trees are built, many of the branches may reflect _____ in the training data.
(A) Tree (B) Root (C) Noise (D) None
33. A decision tree starts as a single node representing the __________.
(A) Training data set (B) Training samples (C) Training tuples (D) None
34. The ______ measure is used to select the test attribute at each node in the tree.
(A) Information root (B) Information leaf (C) Information gain (D) None
35. Tree pruning methods address the problem of ________ the data.
(A) Tree induction (B) Overfitting (C) Statistical reduction (D) None
36. A transaction t is said to support an item I if I is present in t. t is said to support a subset of items ____ if t supports each item I in X.
(A) X ⊆ A (B) X ⊆ I (C) A ⊆ X (D) I ⊆ X
37. If a rule describes associations between quantitative items or attributes, then it is known as a ________.
(A) Relative rule (B) Interval rule (C) Quantitative association rule (D) None
38. If the items in an association rule reference only one dimension, then it is known as a _____
(A) Single rule (B) Single dimensional association rule
(C) Single dimensional relative rule (D) Single association rule
39. We refer to the rule set mined as consisting of _________
(A) Association rule (B) Single dimensional association rule
(C) Multilevel Association rule (D) None
40. ________ is an item set where an item set ‘c’ is closed if there exists no proper superset of c with the same support count.
(A) Fixed closed item set (B) Local closed item set
(C) Frequent closed item set (D) None
41. The Apriori algorithm is also known as the ________
(A) Linked algorithm (B) Base level algorithm (C) Level wise algorithm (D) None
42. The partition algorithm is based on the ______ that frequent sets are normally very few in number compared to the set of all item sets.
(A) Visualization (B) Classification (C) Observation (D) None
43. The Apriori algorithm operates on a _______________ search method.
(A) Depth first search (B) Breadth first search (C) Boundary first search (D) None
44. MFCS stands for _________
(A) Maximal frequent candidate set (B) Minimal frequent candidate set
(C) Manually frequent candidate set (D) None
45. In the dynamic item set counting algorithm, which of the following is not a structure?
(A) Dashed box (B) Dashed circle (C) Solid box (D) Fill box
46. The item sets in the solid category structures are not subjected to any ________.
(A) Numbering (B) Counting (C) Observation (D) None
47. Certain item sets in the dashed circle move into the _____________
(A) Solid box (B) Base box (C) Dashed box (D) Solid circle
48. In the dynamic item set counting algorithm, the item sets that have completed one full pass move from the dashed structure to the ___________
(A) Solid box (B) Solid circle (C) Entire solid structure (D) None
49. In the FP – tree growth algorithm, for instance, if there are 10,000 frequent 1 – item sets, then how many candidate 2 – item sets are there?
(A) 10^4 (B) 10^5 (C) 10^6 (D) 10^7
50. FP – tree stands for _____
(A) Feasible pattern tree (B) Fixed pattern tree
(C) Frequent pattern tree (D) None
51. FP – tree is an ______ tree structure.
(A) Informated prefix (B) Extended prefix (C) Extended suffix (D) None
52. Multidimensional association rules with no repeated predicates are called __________
(A) Extra dimension association rule (B) Inter dimension association rule
(C) Intra dimension association rule (D) None
53. Categorical attributes are also called :-
(A) Quantitative (B) Normal (C) Nominal (D) None
54. ARCS stands for ___________
(A) Association rule clock system (B) Association rule closing system
(C) Association rule clustering system (D) None
55. Quantitative attributes can have a very wide range of values defining their ___________
(A) domain (B) Range (C) Set (D) None
56. Which of the following is not a binning strategy?
(A) Equiwidth binning (B) Equidepth binning
(C) Heterogeneity based binning (D) Homogeneity – based binning
57. A non – grid – based technique has been proposed to find _________
(A) Qualitative rule (B) Categorical attribute (C) Quantitative association rule (D) None
58. The correlation between the occurrence of A and B can be measured by computing ________
(A) Corr A.B = (B) CorrA.B =
(C) Corr A.B = (D) None
59. The correlation of A and B is equivalent to P(B|A) / P(B), which is also called the __________ of the association rule A ⇒ B
(A) Lift (B) Left (C) Link (D) None
60. If the resulting value of Corr A.B is less than 1, then the occurrence of A is _____ with the occurrence of B
(A) Negatively correlated (B) Positively correlated (C) Equality correlated (D) None
61. An algorithm that performs a series of walks through item set space is called :-
(A) Relative walk algo (B) Random walk algo (C) Recursive walk algo (D) None
62. A full – breadth search in which no background knowledge of frequent item sets is used for pruning is known as _____
(A) Level by level independent (B) Level cross filtering by single item
(C) Level cross filtering by k – item set (D) None
63. ________ removes branches from a fully grown tree.
(A) Pre pruning (B) Equi pruning (C) Post pruning (D) None
64. The cost complexity pruning algorithm is an example of the ______ approach.
(A) Pre pruning (B) Post pruning (C) Equi pruning (D) None
65. Clustering is a form of _____
(A) Learning by examples (B) Learning by observation
(C) Learning by sequences (D) None
66. Which is not a clustering method?
(A) Hierarchical (B) Density based (C) Partitioning method (D) None
67. PAM stands for ________
(A) Partitioning around medoids (B) Partitioning around means
(C) Partitioning around methods (D) None
68. Squared error criterion is defined as ___________
(A) (B)
(C) (D) None
69. In the clustering of a set of objects based on the k – means method, the mean of each cluster is marked by _____
(A) "+" (B) "–" (C) Underscore (D) None
70. In the k – medoids method, the medoid can be used, which is the most ________ located object in a cluster.
(A) Closely (B) Centrally (C) Locally (D) None
71. CLARA is a sampling – based method whose name stands for _________
(A) Clustering list application (B) Clustering level application
(C) Clustering large application (D) None
72. CLARA draws _________ of the data set.
(A) Multiple example (B) Multiple view (C) Multiple samples (D) None
73. CLARANS is ________ that combines the sampling technique with PAM.
(A) CLARA based upon recursive search (B) CLARA based upon request search
(C) CLARA based upon redundant search (D) None
74. A hierarchical clustering method works by grouping data objects into a ______ of clusters.
(A) Group (B) Aggregation (C) Tree (D) None
75. Which is a hierarchical clustering method?
(A) DBSCAN method (B) Agglomerative method (C) DD reachable (D) None
76. An object is said to be a core object if _____________
(A) | Ne (o) | < Min Pts (B) | Ne (o) | < Min pts
(C) | Ne (o) | > Min pts (D) | Ne (o) | > Min pts
77. The density – reachability relation is a ___________.
(A) Symmetric (B) Anti symmetric (C) Transitive (D) None
78. Sequential pattern mining can be used to investigate _________ in customer consumption.
(A) Relation (B) Changes (C) Loyalty (D) None
79. A legacy database is a group of ____________ databases that combines different kinds of data systems.
(A) Relational (B) Hierarchical
(C) Heterogeneous (D) Homogeneous
80. A _________ database usually stores relational data that include time – related attributes.
(A) Heterogeneous (B) Homogeneous
(C) Hierarchical (D) Temporal
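Worked example (binning, Part A questions 15 and 16). The following is a minimal Python sketch of smoothing by bin means and by bin boundaries, assuming equal – depth bins. The function names and the sample values are illustrative assumptions, not part of the question paper, and ties between the two boundaries are resolved toward the lower boundary by choice.

```python
def equal_depth_bins(values, depth):
    """Sort the values and split them into consecutive bins of `depth` items."""
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]


def smooth_by_means(bins):
    """Replace every value in a bin by the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value by the closer of the bin's minimum and maximum."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]  # bins are sorted, so these are min and max
        # Assumption: a value equally close to both boundaries goes to `low`.
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed


# Illustrative sample data (hypothetical prices).
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
bins = equal_depth_bins(prices, 3)
by_means = smooth_by_means(bins)
by_boundaries = smooth_by_boundaries(bins)
print(bins)
print(by_means)
print(by_boundaries)
```

With these sample values the three bins are [4, 8, 15], [21, 21, 24] and [25, 28, 34]; their means are 9, 22 and 29, and boundary smoothing maps 8 to 4 and 28 to 25.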
PART-B
1. Pivot is also known as ________; it is a ________ that is used to provide an alternative presentation of the data
(A) View (B) Rotate
(C) Visualization operation (D) Roll up operation
(1) AD (2) AC (3) BD (4) BC
2. True or false :-
(A) The slice operation performs a selection on two or more dimensions.
(B) The dice operation defines a subcube by performing a selection on one dimension.
(1) TF (2) FT (3) TT (4) FF
3. True or false :-
(A) A data warehouse presents relevant information from which performance can be measured.
(B) A data warehouse provides a consistent view of customer data.
(1) TF (2) FT (3) TT (4) FF
4. Drill down can be realized by either ______ the concept hierarchy or introducing ________ dimensions.
(A) Stepping down (B) Stepping up (C) Additional (D) Alternate
(1) AD (2) AC (3) BC (4) BD
5. A data warehouse may bring about ____ by tracking trends, patterns, and exceptions over long periods of time in a ________ manner.
(A) Cost comfort (B) Inconsistent (C) Reliable (D) Cost reduction
(1) AC (2) DB (3) DC (4) DA
6. The data source view exposes the information being captured, ________, and managed by _________.
(A) Operational systems (B) Coordination systems (C) Recorded (D) Stored
(1) AB (2) AC (3) CD (4) DA
7. Data warehouse view includes _________ and ___________.
(A) Visualization table (B) Fact tables (C) Dimension tables (D) Cube table
(1) AD (2) AC (3) BC (4) BD
8. A relational OLAP model is an __________ relational DBMS that maps _____ on multidimensional data to standard relational operations.
(A) Pretend (B) Extended (C) Operations (D) Relations
(1) AC (2) BC (3) CA (4) DB
9. The top tier of the data warehouse architecture is a ________, which contains __________
(A) Client (B) Server (C) Reporting tools (D) Pattern tool
(1) AC (2) BC (3) AD (4) BD
10. The top tier is a front – end tool, the middle tier is the _____, and the bottom tier is the ________
(A) Analysis tool (B) OLAP server (C) Data warehouse server (D) Data
(1) AC (2) BC (3) BD (4) AD
11. Name the following two properties, in order:
(1) Any subset of a frequent set is a frequent set.
(2) Any superset of an infrequent set is an infrequent set.
(A) Downward closure, upward closure (B) Downward closure, frequent closure
(C) Upward closure, downward closure (D) Upward closure, frequent closure
12. Name the following two kinds of set, in order:
(1) A set that is frequent and has no superset that is a frequent set.
(2) A set that is not frequent, but all of whose proper subsets are frequent sets.
(A) Maximal frequent set, minimum frequent set (B) Maximal frequent set, border set
(C) Border set, minimum frequent set (D) Border set, Maximal frequent set
13. The Apriori algorithm uses the ______ and it is a ______ approach, moving upward level – wise in the lattice.
(A) Upward closure property, up – bottom search
(B) Downward closure property, up – bottom search
(C) Downward closure property, bottom – up search
(D) Upward closure property, bottom up search
14. The set of candidate item sets is subjected to a ______ to ensure that all the subsets of the candidate sets are already known to be ________.
(A) Bottom – up process, frequent set (B) Pruning process, frequent item set
(C) Candidate process, frequent set (D) Upward process, frequent item set
15. The pincer search algorithm attempts to find the frequent item sets in a _________ manner, but at the same time it maintains a list of _________.
(A) Up – bottom, frequent item set (B) Up – bottom, maximal frequent item set
(C) Bottom – up, frequent item set (D) Bottom – up, maximal frequent item set
16. Item sets in the dashed structures have a ________ and a ______ associated with them.
(A) Serial no, track (B) Serial no., counter
(C) Counter, stop number (D) Serial no., stop number
17. The frequent item header table consists of two fields, which are ______ and ________.
(A) Support count (B) Item name (C) Head of node link (D) Root node
(1) AB (2) BC (3) AD (4) BD
18. The FP – tree of a data set is an ________ tree structure, storing crucial and _____ information about frequent sets.
(A) Extended prefix (B) Qualitative (C) Extended suffix (D) Quantitative
(1) AB (2) AD (3) BC (4) CD
19. A frequent pattern tree is a tree structure consisting of an ______ and a ____ table.
(A) Item – suffix tree (B) Item prefix tree
(C) Frequent – item header (D) Frequent item link
(1) AC (2) BC (3) AD (4) BD
20. Quantitative attributes are ______ and have an ________ ordering among values.
(A) Implicit (B) Explicit (C) Numeric (D) Character
(1) AC (2) BC (3) AD (4) CD
21. The occurrence of item set A is ____ of the occurrence of item set B if P(A ∪ B) = P(A) P(B); otherwise item sets A and B are dependent and ___________
(A) Item set (B) Dependent (C) Independent (D) Correlated
(1) AB (2) AC (3) CD (4) CB
22. The heterogeneous databases in a legacy database may be connected by ________ or ______ networks
(A) Inter (B) Intra (C) Extra (D) Pre
(1) AC (2) BD (3) AB (4) AD
23. A time – series database stores _____ of values that change with ________.
(A) Time (B) Collection (C) Sequences (D) Random
(1) AB (2) BC (3) DA (4) DB
24. Spatial data may be represented in ________ consisting of n – dimensional __________.
(A) Role map (B) Raster format (C) Co – raster format (D) Pixel map
(1) AC (2) BD (3) AD (4) AB
25. Each object is associated with a set of ______ that describe the object and a set of _____ that the object can use to communicate with other objects
(A) Variables (B) Methods (C) Keywords (D) Messages
(1) AB (2) AC (3) BD (4) AD
26. The object – relational model ______ the basic relational data model by adding the power to handle ________
(A) Extends (B) Describes (C) Complex data (D) Relational data
(1) AC (2) AD (3) AB (4) BD
27. Data cleaning is used to ________ noise and inconsistent data, and data integration is used to ________ data from multiple sources :-
(A) Extend (B) Remove (C) Merge (D) Repeat
(1) AB (2) BC (3) CD (4) AD
28. A decision tree is a ______ – like tree structure, where each internal node denotes a test on an attribute and each leaf node represents a _______
(A) Flow – chart (B) Data flow diag. (C) Class (D) Attribute
(1) BD (2) BC (3) AC (4) AD
29. The basic algorithm for decision tree induction is a ______ algorithm that constructs decision trees in a _____ recursive divide – and – conquer manner.
(A) Greedy (B) Top – down (C) Down – to – top (D) Breaking
(1) AC (2) AB (3) BC (4) AD
30. When a query is posed to a __________, a metadata dictionary is used to _____ the query into queries appropriate for the individual heterogeneous sites involved.
(A) Client site (B) Server site (C) Compile (D) Translate
(1) AC (2) AD (3) BC (4) BD
31. State True/False :-
(A) OLAP operations make use of background knowledge regarding the domain of the data.
(B) Relational database systems have been widely used in business applications.
(1) TT (2) TF (3) FF (4) FT
32. True or false :-
(A) A data mart focuses on selected subjects, and its scope is department – wide.
(B) The 3 – 4 – 5 rule can be used to segment numeric data into relatively uniform natural intervals.
(1) TF (2) FT (3) TT (4) FF
33. The DBSCAN algorithm maintains the set of objects in ________ different categories, one of which is ______
(A) Three (B) Four (C) Noise (D) Cluster
(1) AC (2) AD (3) BC (4) BD
34. True or false :-
(A) Each unclassified object has an associated cluster – id indicating its cluster.
(B) Classified objects do not have any cluster – id.
(1) TF (2) FT (3) FF (4) TT
35. CLARANS has been experimentally shown to be more effective than both _____ and _________
(A) SAM (B) CLARA (C) PAM (D) Hierarchy
(1) AB (2) BC (3) BD (4) CD
36. True or false :-
(A) Tree pruning methods address the problem of over fitting the data
(B) The learning and classification steps of decision tree induction are generally slow.
(1) TT (2) FF (3) FT (4) TF
37. True or false :-
(A) The individual tuples making up the training set are referred to as the training data set.
(B) The data tuples analyzed to build the model collectively form the training samples.
(1) TT (2) FF (3) FT (4) TF
38. True or false :-
(A) In constraint – based association mining, level constraints specify the dimensions of the data.
(B) Data constraints specify the set of task – relevant data.
(1) TT (2) FF (3) FT (4) TF
39. True or false :-
(A) A correlation rule is of a form in which the occurrences of the items are correlated.
(B) The transactions can be summarized in a contingency table.
(1) TT (2) FF (3) FT (4) TF
40. True or false :-
(A) A non – grid – based technique has been proposed to find qualitative association rules.
(B) A grid – based technique is used to describe the initial association rules.
(1) TT (2) FF (3) FT (4) TF
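Worked example (correlation of item sets, Part B questions 21 and 39). The sketch below computes the correlation (lift) of two item sets from the counts in a 2 x 2 contingency table, as the joint probability of A and B divided by P(A) P(B). The function names and the transaction counts are hypothetical and chosen only for illustration.

```python
def lift(n_total, n_a, n_b, n_both):
    """Correlation (lift) of item sets A and B from transaction counts.

    n_total: number of transactions
    n_a:     transactions containing A
    n_b:     transactions containing B
    n_both:  transactions containing both A and B
    """
    p_a = n_a / n_total
    p_b = n_b / n_total
    p_both = n_both / n_total
    return p_both / (p_a * p_b)


def interpret(value, tolerance=1e-9):
    """Classify a lift value relative to 1 (independence)."""
    if abs(value - 1.0) <= tolerance:
        return "independent"
    return "positively correlated" if value > 1.0 else "negatively correlated"


# Hypothetical counts: 10,000 transactions, 6,000 with A, 7,500 with B, 4,000 with both.
value = lift(10_000, 6_000, 7_500, 4_000)
print(round(value, 3), interpret(value))
```

For these counts the lift is 0.4 / (0.6 x 0.75), which is about 0.889, so A and B come out negatively correlated even though the rule A ⇒ B would have a confidence of about 66.7 percent.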
PART-C
1. True or false :-
(A) The hybrid OLAP approach combines ROLAP and MOLAP technology.
(B) MOLAP servers support multidimensional views of data.
(C) A MOLAP server is an intermediate server.
(D) A virtual warehouse is easy to build but requires excess capacity on operational database servers.
(1) TTTT (2) FFFF (3) TFTF (4) TTFT
2. Match the following :-
(1) ROLAP (A) Relative on line analytical processing
(2) HOLAP (B) Relational on line analytical processing
(C) Homogenous on line analytical processing
(D) Hybrid on line analytical processing
(1) 1 – A, 2 – C (2) 1 – B, 2 – C (3) 1 – A, 2 – B (4) 1 – B, 2 – D
3. True or false :-
(A) Classification, prediction, association and clustering are data mining functions.
(B) Data warehouse is a subject – oriented, distributed, time – variant collection of data in support of management’s decision making process.
(C) Data warehouse focuses on the modeling and analysis of data for decision makers.
(D) The major task of an on – line operational database system is to perform on – line transaction and query processing.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
4. Match the following :-
(1) DB – design (A) Day – to – day operations
(2) Access (B) Transaction
(3) Function (C) Application oriented
(4) Orientation (D) Read / write
(a) 1 – C, 2 – D, 3 – B, 4 – A (b) 1 – D, 2 – C, 3 – B, 4 – A
(c) 1 – C, 2 – D, 3 – A, 4 – B (d) 1 – A, 2 – C, 3 – D, 4 – B
5. Match the following :-
(1) Characteristic (A) Analyst
(2) Data (B) Historical
(3) View (C) Informational processing
(4) User (D) Multidimensional
(a) 1 – A, 2 – D, 3 – B, 4 – C (b) 1 – C, 2 – B, 3 – D, 4 – A
(c) 1 – A, 2 – D, 3 – C, 4 – B (d) 1 – C, 2 – B, 3 – A, 4 – D
6. True or false :-
(A) Data mining refers to the knowledge mining from data.
(B) Data warehouse server is responsible for fetching the relevant data.
(C) Clustering is used to merge noisy data
(D) Data mining is the process of discovering interesting knowledge from large amounts of data.
(1) TTTT (2) FFFF (3) TFTF (4) TTFT
7. True or false :-
(A) Pattern evaluation is a part of architecture of data mining.
(B) API is used to communicate between user and data mining system.
(C) Relational database is a collection of tables.
(D) Graphical user interface is used to communicate between user and data mining system.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
8. True or false :-
(A) Relational data can be accessed by database queries.
(B) A query is transformed into a set of relational operations.
(C) A query allows retrieval of specified subset of the data.
(D) Data warehouse is modeled by multidimensional database structure.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
9. True or false :-
(A) The scope of a database is department – wide.
(B) A data mart is a subset of a data warehouse.
(C) A data mart is limited to a data warehouse.
(D) Roll – up is not an OLAP operation.
(1) TTTT (2) FFFF (3) FTFF (4) TFTT
10. True or false :-
(A) Data mining technique can be used to find the characteristics of object evolution.
(B) Text databases are databases that contain image descriptions or media descriptions.
(C) A temporal database involves several timestamps.
(D) A legacy database is a group of heterogeneous databases.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
11. True or false :-
(A) A database may contain outlier objects.
(B) Outliers may be detected by using statistical tests.
(C) Clustering can also facilitate taxonomy formation.
(D) Clustering analyzes data objects without consulting a known class label.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
12. True or false :-
(A) Crime detection is associated with data mining.
(B) The design and construction of data warehouses is based on the benefits of data mining.
(C) The retail industry conducts sales campaigns using advertisements.
(D) Customer loyalty and purchase trends are associated with data mining.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
13. True or false :-
(A) A data warehouse is a collection of time – variant data.
(B) A data warehouse focuses on the analysis of data.
(C) A data warehouse is constructed by breaking up a large heterogeneous system.
(D) A data warehouse is constructed by using on – line records and relational databases.
(1) TTTT (2) FFFF (3) TFTF (4) TTFT
14. True or false :-
(A) The query – driven approach is one in which information from multiple sources is integrated.
(B) The update – driven approach requires complex information filtering processes.
(C) A data warehouse does not support complex multidimensional queries.
(D) A data warehouse contains the most current information.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
15. True or false :-
(A) OLAP covers the most day – to – day information.
(B) OLTP covers the historical information on data.
(C) The main users of OLAP are DBAs and clerks.
(D) Main function of OLTP is decision support.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
16. True or false :-
(A) Dimensions are the entities with respect to which an organization keeps records.
(B) OLTP involves index operations.
(C) OLAP involves lots of scan operations.
(D) OLAP has a flat relational view.
(1) TTTT (2) TTTF (3) TFTF (4) TFTT
17. True or false :-
(A) A fact table does not have any key.
(B) A dimension table has a key.
(C) The lowest level of summarization is the 0 – D cuboid.
(D) The highest level of summarization is the apex cuboid.
(1) TTTT (2) FFFF (3) FTFT (4) TFTT
18. True or false :-
(A) In a star schema, each dimension is represented by only one table.
(B) The snowflake schema is also known as the galaxy schema.
(C) A fact constellation schema allows dimension tables to be shared between fact tables.
(D) SQL can be used to specify relational query.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
19. True or false :-
(A) Removal of data may be seen as a form of data transformation.
(B) Data cleaning is a process that smooths out noise and identifies outliers.
(C) Noise is variance in a measured variable.
(D) Binning methods smooth a sorted data value by consulting the values around it.
(1) TTTT (2) FFFF (3) FTTT (4) TFTT
20. True or false :-
(A) In smoothing by bin boundaries, each bin value is replaced by the closest boundary value.
(B) In binning, the sorted values are distributed into a number of buckets.
(C) Values that fall outside of the set of clusters may be considered outliers.
(D) In linear regression, more than two variables are involved and the data are fit to a multidimensional surface.
(1) TTTT (2) FFFF (3) TTFF (4) TFTT
21. True or false :-
(A) Data integration sources include multiple data cubes.
(B) An attribute may be redundant if it can be derived from another table.
(C) Redundancy can be detected by correlation analysis.
(D) Generalization of data is a process in which low – level data are replaced by higher – level concepts.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
22. True or false :-
(A) An attribute is normalized by scaling.
(B) Normalization is done so that attribute values fall within a small specified range.
(C) Min – max normalization performs a linear transformation on the original data.
(D) Z – score normalization is also called zero – mean normalization.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
23. True or false :-
(A) Data transformation techniques are used to obtain a reduced representation of the data.
(B) In data compression, encoding is used.
(C) In numerosity reduction, the data are replaced by alternative, smaller data representations.
(D) In dimension reduction, weakly relevant or redundant dimensions are detected and removed.
(1) TTTT (2) FFFF (3) FTTT (4) TFTT
24. True or false :-
(A) Data cubes created for varying levels of abstraction are often referred to as cuboids.
(B) A data cube can be referred to as a lattice of cuboids.
(C) The highest level of abstraction is the base cuboid.
(D) The lowest level of abstraction is the apex cuboid.
(1) TTTT (2) FFFF (3) TTFF (4) TFTT
25. True or false :-
(A) In equi width histogram, the width of each bucket range is uniform.
(B) In an equi – height histogram, the frequency of each bucket is constant.
(C) In an equi depth histogram, the frequency of each bucket is constant.
(D) The V – optimal histogram is the one with the least variance.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
26. True or false :-
(A) Sampling is used as a data normalization technique
(B) If a rule concerns associations between the presence or absence of items, it is a Boolean association rule.
(C) If a rule describes associations between qualitative items, it is a quantitative association rule.
(D) Max patterns can be used to reduce the number of frequent item sets generated in mining.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
27. True or false :-
(A) Under the upward closure property, any superset of an infrequent set is an infrequent set.
(B) Under the downward closure property, any subset of a frequent set is a frequent set.
(C) A frequent set is a minimal frequent set if it is a frequent set and no superset of it is a frequent set.
(D) The maximal frequent sets can act as a compact representation of the set of all frequent sets.
(1) TTTT (2) FFFF (3) TTFT (4) TFTT
28. True or false :-
(A) In equi – depth binning, the interval size of each bin is the same.
(B) In equi – width binning, each bin has the same number of tuples assigned to it.
(C) In homogeneity – based binning, the bin size is determined so that the tuples in each bin are uniformly distributed.
(D) ARCS uses equi width binning.
(1) TTTT (2) FFFF (3) FFTT (4) TFTT
29. True or false :-
(A) Each array cell holds the count distribution of each possible class of the categorical attributes of rule – right hand side.
(B) The disadvantage of correlation is that it is upward closed.
(C) Data classification is a three – step process.
(D) Each tuple is assumed to belong to a predefined class, as determined by the class label attribute.
(1) TTTT (2) TFFT (3) TFTF (4) TFTT
30. True or false :-
(A) Classification and prediction can be compared by using speed criteria
(B) Clustering is the form of learning by observation.
(C) The iterative relocation technique involves moving objects from one group to another.
(D) OPTICS is a hierarchical clustering method that grows clusters according to a threshold level.
(1) TTTF (2) FFFF (3) TFTF (4) TFTT
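Worked example (level – wise frequent set mining, Part C question 27). The following is a minimal Python sketch of a level – wise, bottom – up search in the style of the Apriori algorithm, using the downward closure property to prune candidates. It is a simplified illustration under stated assumptions, not an optimized implementation; the function name `apriori`, the sample transactions, and the minimum support count are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support_count):
    """Return {item set: support count} for every frequent item set."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support_count}
    all_frequent = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: combine frequent (k-1)-item sets into candidate k-item sets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (downward closure): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions for this level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support_count}
        all_frequent.update(frequent)
        k += 1
    return all_frequent


# Hypothetical transactions and threshold.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = apriori(sample, min_support_count=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

In this sample every single item and every pair is frequent, while {a, b, c} is generated as a candidate but appears in only two transactions, so each pair is a maximal frequent set and {a, b, c} is a border set.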
1. The cuboid that holds the lowest level of summarization is called the ________
(A) D – cubiod (B) Apex – cuboid (C) Base – cuboid (D) None
2. A multidimensional data model is typically organized around a _____________ .
(A) Logical table (B) Central table theme (C) Dimension table (D) None
3. The data cube is a ________ for multidimensional data storage.
(A) Instance (B) Schema (C) Metaphor (D) None
4. The cuboid which holds the highest level of summarization is called _____
(A) 1 – D cuboid (B) High – cuboid (C) Base cuboid (D) Apex – cuboid
5. In star schema, each dimension is represented by only _________that contains a set of ___
(A) One table (B) Many table (C) Property (D) Attribute
(1) A C (2) BC (3) AD (D) AD
6. OLAP provides a ______ environment for interactive data analysis
(A) User – safety (B) User – required (C) User – friendly (D) None
7. Each dimension contain multiple level of ________ defined by concept hierarchies
(A) Behaviour (B) Property (C) Obstraction (D) Non e
8. The data cube that contains all the dimensions into aggregated we refer to this cube as ____.
(A) Central cube (B) Centrel core (C) Base cube (D) None
9. The operation rollup is also known as _____
(A) Roll – pit (B) Drill – pit (C) Drill – up (D) None
10. The rollup operation shown aggregates the data by _________ the concept hierarchy.
(A) Decending (B) Ascending (C) Aggregating (D) None
11. Real word data tend to be noisy and ________.
(A) Consistent (B) Inconsistent (C) Complete (D) None
12. Data cleaning steps attempt to ______ missing values.
(A) Join (B) Complete (C) Fill in (D) None
13. Which approach, of filling the missing values is not feasible and time consuming.
(A) Ignore tuple (B) By manually
(C) Use global constant (D) Use the attributes mean.
14. Noise is a type of _________ in measured variable.
(A) Numeric attribute (B) Random error (C) Technique (D) None
15. In smoothing by bin means each value in a bin is replaced by the _______ value of the bin.
(A) Bin (B) Exact (C) Mean (D) None
16. In smoothing by bin boundaries, the minimum and maximum values in a given bin are identified as the ______ boundaries.
(A) Mean (B) Bin (C) Nearest (D) None
17. Outliers may be detected by __ where similar values are organized into groups
(A) Grouping (B) Joining (C) Clustering (D) None
18. Data can be smoothed by fitting the data to a function such as ____
(A) Aggregation (B) Clustering (C) Regression (D) None
19. Multiple linear regression is an ______ of linear regression where more than two variables are involved and the data are fit to a multidimensional surface.
(A) Group (B) Extension (C) Part (D) None
20. Data mining often requires _____ that is merging of data from multiple data stores.
(A) Data aggregation (B) Data integration (C) Data merging (D) None
21. The correlation between attribute A and B can be measured by
(A) (B)
(C) (D)
22. Which is not involved in data transformation?
(A) Smoothing (B) Aggregation (C) Functional dependence (D) Generalization
23. Z – score normalization is also known as ____
(1) Base normalization (2) Zero – mean normalization (3) I – D mean normalization (4) None
24. Which is not the strategy of data reduction?
(A) Data cube aggregation (B) Dimension reduction
(C) Integrity concept (D) None
25. Data reduction techniques can be used to obtain a reduced ______ of the data set that is much smaller in volume.
(A) Arrangement (B) Visualization (C) Representation (D) None
26. The stepwise backward elimination procedure starts with the _____ set of attributes.
(A) Empty (B) Full (C) Reduced (D) None
27. Classification and prediction are two forms of ____ that can be used to extract models describing important data.
(A) Data prediction (B) Data analysis (C) Data aggregation (D) Data grouping
28. The data tuples analyzed to build the model collectively form the _____
(A) Training data set (B) Training samples (C) Supervised learning (D) None
29. Concept hierarchies may be used for which purpose?
(A) Data cleaning (B) Data transformation (C) Normalization (D) None
30. Which is not a criterion used to compare classification and prediction methods?
(A) Speed (B) Predictive accuracy (C) Scalability (D) Reliability
31. In classification by decision tree induction, a path is traced from the root to a __ that holds the class prediction for that sample.
(A) Root (B) Leaf (C) Single branch (D) None
32. When decision trees are built, many of the branches may reflect _____ in the training data.
(A) Tree (B) Root (C) Noise (D) None
33. A decision tree starts as a single node representing the __________.
(A) Training data set (B) Training samples (C) Training tuples (D) None
34. The ______ measure is used to select the test attribute at each node in the tree.
(A) Information root (B) Information leaf (C) Information gain (D) None
35. Tree pruning methods address the problem of ________ the data.
(A) Tree induction (B) Over fitting (C) Statistical reduction (D) None
36. A transaction t is said to support an item I if I is present in t. The transaction t is said to support a subset of items ____ if t supports each item I in X.
(A) X ⊆ A (B) X ⊆ I (C) A ⊆ X (D) I ⊆ X
37. If a rule describes associations between quantitative items or attributes, then it is known as a ________.
(A) Relative rule (B) Interval rule (C) Quantitative association rule (D) None
38. If the items in an association rule reference only one dimension, then it is known as a _____
(A) Single rule (B) Single dimensional association rule
(C) Single dimensional relative rule (D) Single association rule
39. We refer to the rule set mined as consisting of _________
(A) Association rule (B) Single dimensional association rule
(C) Multilevel Association rule (D) None
40. A ________ is a frequent item set that is also closed, where an item set ‘c’ is closed if there exists no proper superset of c with the same support.
(A) Fixed closed item set (B) Local closed item set
(C) Frequent closed item set (D) None
41. The Apriori algorithm is also known as the ________
(A) Linked algorithm (B) Base level algorithm (C) Level wise algorithm (D) None
42. The partition algorithm is based on the ______ that the frequent sets are normally very few in number compared to the set of all item sets.
(A) Visualization (B) Classification (C) Observation (D) None
43. The Apriori algorithm operates on a _______________ search method.
(A) Depth first search (B) Breadth first search (C) Boundary first search (D) None
44. MFCS stands for _________
(A) Maximal frequent candidate set (B) Minimal frequent candidate set
(C) Manually frequent candidate set (D) None
45. In the dynamic item set counting algorithm, which is not a structure?
(A) Dashed box (B) Dashed circle (C) Solid box (D) Fill box
46. The item sets in the solid category structures are not subjected to any ________.
(A) Numbering (B) Counting (C) Observation (D) None
47. Certain item sets in the dashed circle move into the _____________
(A) Solid box (B) Base box (C) Dashed box (D) Solid circle
48. In the dynamic item set counting algorithm, the item sets that have completed one full pass move from the dashed structure to the ___________
(A) Solid box (B) Solid circle (C) Entire solid structure (D) None
49. In the FP – tree growth algorithm, for instance, if there are 10000 frequent 1 – item sets, then how many candidate 2 – item sets are there?
(A) 10^4 (B) 10^5 (C) 10^6 (D) 10^7
50. In the FP – tree growth algorithm, FP – tree stands for _____
(A) Feasible pattern tree (B) Fixed pattern tree
(C) Frequent pattern tree (D) None
51. FP – tree is an ______ tree structure.
(A) Informated prefix (B) Extended prefix (C) Extended suffix (D) None
52. Multidimensional association rules with no repeated predicates are called __________
(A) Extra dimension association rule (B) Inter dimension association rule
(C) Intra dimension association rule (D) None
53. Categorical attributes are also called ____
(A) Quantitative (B) Normal (C) Nominal (D) None
54. ARCS stands for ___________
(A) Association rule clock system (B) Association rule closing system
(C) Association rule clustering system (D) None
55. Quantitative attributes can have a very wide range of values defining their ___________
(A) Domain (B) Range (C) Set (D) None
56. Which is not a binning strategy?
(A) Equiwidth binning (B) Equidepth binning
(C) Heterogeneity based binning (D) Homogeneity – based binning
57. A non – grid based technique has been proposed to find _________
(A) Qualitative rule (B) Categorical attribute (C) Quantitative association rule (D) None
58. The correlation between the occurrence of A and B can be measured by computing ________
(A) Corr A.B = (B) CorrA.B =
(C) Corr A.B = (D) None
59. The correlation of A and B is equivalent to P(B|A) / P(B), which is also called the __________ of the association rule A ⇒ B
(A) Lift (B) Left (C) Link (D) None
60. If the resulting value of Corr A.B is less than 1, then the occurrence of A is _____ with the occurrence of B
(A) Negatively correlated (B) Positively correlated (C) Equality correlated (D) None
61. An algorithm that performs a series of walks through item set space is called :-
(A) Relative walk algo (B) Random walk algo (C) Recursive walk algo (D) None
62. The full breadth search, where no background knowledge of frequent item sets is used for pruning, is known as _____
(A) Level by level independent (B) Level cross filtering by single item
(C) Level cross filtering by k (D) None
63. ________ removes branches from a fully grown tree.
(A) Pre pruning (B) Equi pruning (C) Post pruning (D) None
64. Cost complexity pruning is an example of the ______ approach.
(A) Pre pruning (B) Post pruning (C) Equi pruning (D) None
65. Clustering is a form of _____
(A) Learning by examples (B) Learning by observation
(C) Learning by sequences (D) None
66. Which is not a clustering method?
(A) Hierarchical (B) Density based (C) Partitioning method (D) None
67. PAM stands for ________
(A) Partitioning around medoids (B) Partitioning around means
(C) Partitioning around methods (D) None
68. Squared error criterion is defined as ___________
(A) (B)
(C) (D) None
69. In the clustering of a set of objects based on the k – means method, the mean of each cluster is marked by _____
(A) “+” (B) “–” (C) Underscore (D) None
70. In the k – medoids method, the medoid is used, which is the most ______ located object in a cluster.
(A) Closter (B) Centrally (C) Locally (D) None
71. CLARA is a sampling – based method whose name stands for _________
(A) Clustering list application (B) Clustering level application
(C) Clustering large application (D) None
72. CLARA draws _________ of the data set.
(A) Multiple example (B) Multiple view (C) Multiple samples (D) None
73. CLARANS is ________ that combines the sampling technique with PAM.
(A) CLARA based upon recursive search (B) CLARA based upon request search
(C) CLARA based upon redundant search (D) None
74. A hierarchical clustering method works by grouping data objects into a ______ of clusters.
(A) Group (B) Aggregation (C) Tree (D) None
75. Which is a hierarchical clustering method?
(A) DBScan method (B) Agglomerative method (C) DD reachable (D) None
76. An object is said to be a core object if _____________
(A) | Ne (o) | < Min Pts (B) | Ne (o) | < Min pts
(C) | Ne (o) | > Min pts (D) | Ne (o) | > Min pts
77. The density – reachability relation is ___________.
(A) Symmetric (B) Anti symmetric (C) Transitive (D) None
78. Sequential pattern mining can be used to investigate _________ in customer consumption.
(A) Relation (B) Changes (C) Loyalty (D) None
79. A legacy database is a group of ____________ that combines different kinds of data systems.
(A) Relational data base (B) Hierarchical
(C) Heterogeneous (D) Homogenous
80. A _________ database usually stores relational data that include time – related attributes.
(A) Heterogeneous (B) Homogenous
(C) Hierarchical (D) Temporal
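Note on Questions 15 and 16 of Part A: the two binning techniques named there can be illustrated with a minimal Python sketch. The price list, the equal-depth partitioning, the bin depth of 3, and the rule that ties go to the lower boundary are all assumptions made for illustration; none of them comes from this paper.

```python
def equal_depth_bins(values, depth):
    """Sort the values and split them into consecutive bins of `depth` items."""
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]


def smooth_by_means(bins):
    """Replace every value in a bin by the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value by the closer of the bin's minimum and maximum."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]  # bins are sorted, so these are min and max
        # Assumption: a value equally close to both boundaries goes to the lower one.
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed


# Illustrative data only (hypothetical prices).
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
bins = equal_depth_bins(prices, 3)
by_means = smooth_by_means(bins)
by_boundaries = smooth_by_boundaries(bins)
print(by_means)
print(by_boundaries)
```

With these values the first bin [4, 8, 15] becomes [9.0, 9.0, 9.0] under smoothing by bin means and [4, 4, 15] under smoothing by bin boundaries.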
PART-B
1. Pivot is also known as ________; it is a ________ that is used to provide an alternative presentation of the data
(A) View (B) Rotate
(C) Visualization operation (D) Roll up operation
(1) AD (2) AC (3) BD (4) BC
2. True or false :-
(A) The slice operation performs a selection on two or more dimensions.
(B) The dice operation defines a subcube by performing a selection on one dimension.
(1) TF (2) FT (3) TT (4) FF
3. True or false :-
(A) A data warehouse presents relevant information from which performance can be measured.
(B) A data warehouse provides customer data in a consistent view.
(1) TF (2) FT (3) TT (4) FF
4. Drill down can be realized by either ______ the concept hierarchy or introducing ________ dimensions.
(A) Step down (B) Step up (C) Additional (D) Alternate
(1) AD (2) AC (3) BC (4) BD
5. A data warehouse may bring about ____ by tracking trends, patterns, and exceptions over long periods of time in a ________ manner.
(A) Cost comfort (B) Inconsistent (C) Reliable (D) Cost reduction
(1) AC (2) DB (3) DC (4) DA
6. The data source view exposes the information being captured, ________, and managed by _________.
(A) Operational system (B) Coordination system (C) Record (D) Stored
(1) AB (2) AC (3) CD (4) DA
7. Data warehouse view includes _________ and ___________.
(A) Visualization table (B) Fact tables (C) Dimension tables (D) Cube table
(1) AD (2) AC (3) BC (4) BD
8. A relational OLAP model is an __________ relational DBMS that maps _____ on multidimensional data to standard relational operations.
(A) Pretend (B) Extended (C) Operation (D) Relations
(1) AC (2) BC (3) CA (4) DB
9. The top tier of the data warehouse architecture is a ________ which contains __________
(A) Client (B) Server (C) Reporting tools (D) Pattern tool
(1) AC (2) BC (3) AD (4) BD
10. The top tier is a front – end tool, the middle tier is the _____, and the bottom tier is the ________
(A) Analysis tool (B) OLAP server (C) Data warehouse server (D) Data
(1) AC (2) BC (3) BD (4) AD
11. (1) Any subset of a frequent set is a frequent set
(2) Any superset of an infrequent set is an infrequent set
(A) Downward closure, upward closure (B) Downward closure, frequent closure
(C) Upward closure, downward closure (D) Upward closure, frequent closure
12. (1) A set that is a frequent set and has no superset that is a frequent set
(2) A set that is not a frequent set, but all of whose proper subsets are frequent sets
(A) Maximal frequent set, minimum frequent set (B) Maximal frequent set, border set
(C) Border set, minimum frequent set (D) Border set, Maximal frequent set
13. The Apriori algorithm uses the ______ and it is a ______ approach, moving upward level – wise in the lattice.
(A) Upward closure property, up – bottom search
(B) Downward closure property, up – bottom search
(C) Downward closure property, bottom – up search
(D) Upward closure property, bottom – up search
14. The set of candidate item sets is subjected to ______ to ensure that all the subsets of the candidate sets are already known to ________.
(A) Bottom – up process, frequent set (B) Pruning process, frequent item set
(C) Candidate process, frequent set (D) Upward process, frequent item set
15. The pincer search algorithm attempts to find the frequent item sets in a _________ manner, but at the same time it maintains a list of _________.
(A) Up – bottom, frequent item set (B) Up – bottom, maximal frequent item set
(C) Bottom – up, frequent item set (D) Bottom – up, maximal frequent item set
16. Item sets in the dashed structures have a ________ and a ______ associated with them.
(A) Serial no, track (B) Serial no., counter
(C) Counter, stop number (D) Serial no., stop number
17. The frequent – item header table consists of two fields, which are ______ and ________.
(A) Support count (B) Item name (C) Head of node link (D) Root node
(1) AB (2) BC (3) AD (4) BD
18. The FP – tree of a data set is an ________ tree structure, storing crucial and _____ information about frequent sets.
(A) Extended prefix (B) Qualitative (C) Extended suffix (D) Quantitative
(1) AB (2) AD (3) BC (4) CD
19. A frequent pattern tree is a tree structure consisting of an ______ and a ____ table.
(A) Item – suffix tree (B) Item prefix tree
(C) Frequent – item header (D) Frequent item link
(1) AC (2) BC (3) AD (4) BD
20. Quantitative attributes are ______ and have an ________ ordering among values.
(A) Implicit (B) Explicit (C) Numeric (D) Character
(1) AC (2) BC (3) AD (4) CD
21. The occurrence of item set A is ____ of the occurrence of item set B if P(A ∪ B) = P(A) P(B); otherwise item sets A and B are dependent and ___________
(A) Item set (B) Dependent (C) Independent (D) Correlated
(1) AB (2) AC (3) CD (4) CB
22. The heterogeneous databases in a legacy database may be connected by ________ or ______ networks
(A) Inter (B) Intra (C) Extra (D) Pre
(1) AC (2) BD (3) AB (4) AD
23. The time series database stores _____ of values that change with ________.
(A) Time (B) Collection (C) Sequences (D) Random
(1) AB (2) BC (3) DA (4) DB
24. Spatial data may be represented in ________ consisting of n – dimensional __________.
(A) Role map (B) Raster format (C) Co – raster format (D) Pixel map
(1) AC (2) BD (3) AD (4) AB
25. Each object has associated with it a set of ______ that describe the object and a set of _____ that the object can use to communicate with other objects
(A) Variables (B) Methods (C) Keywords (D) Messages
(1) AB (2) AC (3) BD (4) AD
26. The object relational model ______ the basic relational data model by adding the power to handle ________
(A) Extends (B) Describes (C) Complex data (D) Relational data
(1) AC (2) AD (3) AB (4) BD
27. Data cleaning is used to ________ noise and inconsistent data, and data integration is used to ________ data from multiple sources :-
(A) Extend (B) Remove (C) Merge (D) Repeat
(1) AB (2) BC (3) CD (4) AD
28. A decision tree is a ______ – like tree structure, where each internal node denotes a test on an attribute and each leaf node represents a _______
(A) Flow – chart (B) Data flow diag. (C) Class (D) Attribute
(1) BD (2) BC (3) AC (4) AD
29. The basic algorithm for decision tree induction is a ______ algorithm that constructs decision trees in a _____ recursive divide – and – conquer manner.
(A) Greedy (B) Top – down (C) Down – to – top (D) Breaking
(1) AC (2) AB (3) BC (4) AD
30. When a query is posed to a __________, a metadata dictionary is used to _____ the query into queries appropriate for the individual heterogeneous sites involved.
(A) Client site (B) Server site (C) Compile (D) Translate
(1) AC (2) AD (3) BC (4) BD
31. State True/False :-
(A) OLAP operations make use of background knowledge regarding the domain of the data.
(B) Relational database systems have been widely used in business applications.
(1) TT (2) TF (3) FF (4) FT
32. True or false :-
(A) A data mart focuses on selected subjects and its scope is department – wide.
(B) The 3 – 4 – 5 rule can be used to segment numeric data into relatively uniform, natural intervals.
(1) TF (2) FT (3) TT (4) FF
33. The DBScan algorithm maintains the set of objects in ________ different categories, one of which is ______
(A) Three (B) Four (C) Noise (D) Cluster
(1) AC (2) AD (3) BC (4) BD
34. True or false :-
(A) Each unclassified object has an associated cluster – id indicating its cluster.
(B) Classified objects do not have any cluster – id.
(1) TF (2) FT (3) FF (4) TT
35. CLARANS has been experimentally shown to be more effective than both _____ and _________
(A) SAM (B) CLARA (C) PAM (D) Hierarchy
(1) AB (2) BC (3) BD (4) CD
36. True or false :-
(A) Tree pruning methods address the problem of over fitting the data
(B) The learning and classification steps of decision tree induction are generally slow.
(1) TT (2) FF (3) FT (4) TF
37. True or false :-
(A) The individual tuples making up the training set are referred to as the training data set.
(B) The data tuples analyzed to build the model collectively form the training samples.
(1) TT (2) FF (3) FT (4) TF
38. True or false :-
(A) In constraint – based association mining, the level constraints specify the dimensions of the data.
(B) Data constraints specify the set of task – relevant data.
(1) TT (2) FF (3) FT (4) TF
39. True or false :-
(A) A correlation rule is of the form {i1, i2, ..., im}, where the occurrences of the items are correlated.
(B) The transactions can be summarized in a contingency table.
(1) TT (2) FF (3) FT (4) TF
40. True or false :-
(A) A non – grid based technique has been proposed to find qualitative association rules.
(B) A grid – based technique described initial association rules.
(1) TT (2) FF (3) FT (4) TF
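Note on Questions 11, 13 and 14 of Part B: the level-wise, bottom-up search with a pruning step based on downward closure can be sketched in Python as follows. This is a minimal illustration, not a reference implementation; the function name `apriori`, the use of an absolute support count as the threshold, and the grocery transactions are assumptions introduced here.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {item set: support count} for all frequent item sets.

    Assumption: min_support is an absolute count, not a fraction.
    """
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # One pass over the data counts the support of every candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-item sets into (k+1)-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step (downward closure): keep a candidate only if every
        # one of its k-item subsets is already known to be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k))
        ]
        k += 1
    return frequent


# Illustrative transactions only (hypothetical data).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(baskets, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

In this data {milk, butter} occurs only once, so the 3-item candidate {bread, milk, butter} is discarded in the prune step without ever being counted.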
PART-C
1. True or false :-
(A) The hybrid OLAP approach combines ROLAP and MOLAP technology.
(B) MOLAP servers support multidimensional views of data.
(C) A MOLAP server is an intermediate server.
(D) A virtual warehouse is easy to build but requires excess capacity on operational database servers.
(1) TTTT (2) FFFF (3) TFTF (4) TTFT
2. Match the following :-
(1) ROLAP (A) Relative on line analytical processing
(2) HOLAP (B) Relational on line analytical processing
(C) Homogenous on line analytical processing
(D) Hybrid on line analytical processing
(1) 1 – A, 2 – C (2) 1 – B, 2 – C (3) 1 – A, 2 – B (4) 1 – B, 2 – D
3. True or false :-
(A) Classification, prediction, association and clustering are data mining function.
(B) A data warehouse is a subject – oriented, distributed, time – variant collection of data in support of management’s decision making process.
(C) A data warehouse focuses on the modeling and analysis of data for decision makers.
(D) The major task of an on – line operational database system is to perform on – line transaction and query processing.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
4. Match the following :-
(1) DB – design (A) Day – to – day operations
(2) Access (B) Transaction
(3) Function (C) Application oriented
(4) Orientation (D) Read / write
(a) 1 – C, 2 – D, 3 – B, 4 – A (b) 1 – D, 2 – C, 3 – B, 4 – A
(c) 1 – C, 2 – D, 3 – A, 4 – B (d) 1 – A, 2 – C, 3 – D, 4 – B
5. Match the following :-
(1) Characteristic (A) Analyst
(2) Data (B) Historical
(3) View (C) Informational processing
(4) User (D) Multidimensional
(a) 1 – A, 2 – D, 3 – B, 4 – C (b) 1 – C, 2 – B, 3 – D, 4 – A
(c) 1 – A, 2 – D, 3 – C, 4 – B (d) 1 – C, 2 – B, 3 – A, 4 – D
6. True or false :-
(A) Data mining refers to the knowledge mining from data.
(B) The data warehouse server is responsible for fetching the relevant data.
(C) Clustering is used to merge noisy data.
(D) Data mining is the process of discovering interesting knowledge from large amounts of data.
(1) TTTT (2) FFFF (3) TFTF (4) TTFT
7. True or false :-
(A) Pattern evaluation is a part of architecture of data mining.
(B) API is used to communicate between user and data mining system.
(C) Relational database is a collection of tables.
(D) Graphical user interface is used to communicate between user and data mining system.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
8. True or false :-
(A) Relational data can be accessed by database queries.
(B) A query is transformed into a set of relational operations.
(C) A query allows retrieval of specified subset of the data.
(D) Data warehouse is modeled by multidimensional database structure.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
9. True or false :-
(A) Database scope is limited to department – wide.
(B) A data mart is a subset of a data warehouse.
(C) A data mart is limited to a data warehouse.
(D) Roll – up is not an OLAP operation.
(1) TTTT (2) FFFF (3) FTFF (4) TFTT
10. True or false :-
(A) Data mining technique can be used to find the characteristics of object evolution.
(B) Text databases are databases that contain image description of media description.
(C) A temporal database involves several timestamps.
(D) A legacy database is a group of heterogeneous databases.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
11. True or false :-
(A) A database may contain outlier objects.
(B) Outliers may be detected by using statistical tests.
(C) Clustering can also facilitate taxonomy formation.
(D) Clustering analyzes data objects without consulting a known class label.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
12. True or false :-
(A) Crime detection is associated with data mining.
(B) The design and construction of data warehouses is based on the benefits of data mining.
(C) The retail industry conducts sales campaigns using advertisements.
(D) Customer loyalty and purchase trends are associated with data mining.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
13. True or false :-
(A) A data warehouse is a collection of time – variant data.
(B) A data warehouse focuses on the analysis of data.
(C) A data warehouse is constructed by breaking up a large heterogeneous system.
(D) A data warehouse is constructed by using on – line records and relational databases.
(1) TTTT (2) FFFF (3) TFTF (4) TTFT
14. True or false :-
(A) In the query – driven approach, information from multiple sources is integrated.
(B) The update – driven approach requires complex information filtering processes.
(C) A data warehouse does not support complex multidimensional queries.
(D) A data warehouse contains the most current information.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
15. True or false :-
(A) OLAP covers the most day – to – day information.
(B) OLTP covers the historical information on data.
(C) The main users of OLAP are DBAs and clerks.
(D) The main function of OLTP is decision support.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
16. True or false :-
(A) Dimensions are the entities with respect to which an organization keeps records.
(B) OLTP has index operations.
(C) OLAP has lots of scan operations.
(D) OLAP has a flat relational view.
(1) TTTT (2) TTTF (3) TFTF (4) TFTT
17. True or false :-
(A) A fact table does not have any key.
(B) A dimension table has a key.
(C) The lowest level of summarization is the 0 – D cuboid.
(D) The highest level of summarization is the apex cuboid.
(1) TTTT (2) FFFF (3) FTFT (4) TFTT
18. True or false :-
(A) In a star schema, each dimension is represented by only one table.
(B) The snowflake schema is also known as the galaxy schema.
(C) A fact constellation schema allows dimension tables to be shared between fact tables.
(D) SQL can be used to specify relational query.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
19. True or false :-
(A) Removal of data may be seen as a form of data transformation.
(B) Data cleaning is a process that smooths out noise and identifies outliers.
(C) Noise is variance in a measured variable.
(D) Binning methods smooth a sorted data value by consulting the values around it.
(1) TTTT (2) FFFF (3) FTTT (4) TFTT
20. True or false :-
(A) In smoothing by bin boundaries, each bin value is replaced by the closest boundary value.
(B) In binning, the sorted values are distributed into a number of buckets.
(C) Values that fall outside of the set of clusters may be considered outliers.
(D) In linear regression, more than two variables are involved and the data are fit to a multidimensional surface.
(1) TTTT (2) FFFF (3) TTFF (4) TFTT
21. True or false :-
(A) Data integration sources include multiple data cubes.
(B) An attribute may be redundant if it can be derived from another table.
(C) Redundancy can be detected by correlation analysis.
(D) In generalization of data, low – level data are replaced by higher – level concepts.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
22. True or false :-
(A) An attribute is normalized by scaling.
(B) Normalization is done so that attribute fall within a small specified range.
(C) Min – max normalization performs a linear transformation on the original data.
(D) Z – score normalization is also called zero – mean normalization.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
23. True or false :-
(A) Data transformation techniques are used to obtain a reduced representation of the data.
(B) In data compression, encoding is used.
(C) In numerosity reduction, the data are replaced by alternative, smaller representations.
(D) In dimension reduction, weakly relevant or redundant dimensions are detected and removed.
(1) TTTT (2) FFFF (3) FTTT (4) TFTT
24. True or false :-
(A) Data cubes created for varying levels of abstraction are often referred to as cuboids.
(B) A data cube can be referred to as a lattice.
(C) The highest level of abstraction is the base cuboid.
(D) The lowest level of abstraction is the apex cuboid.
(1) TTTT (2) FFFF (3) TTFF (4) TFTT
25. True or false :-
(A) In an equi – width histogram, the width of each bucket range is uniform.
(B) In an equi – height histogram, the frequency of each bucket is constant.
(C) In an equi depth histogram, the frequency of each bucket is constant.
(D) The V – optimal histogram is the one with the least variance.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
26. True or false :-
(A) Sampling is used as a data normalization technique.
(B) If a rule concerns associations between the presence or absence of items, it is a Boolean association rule.
(C) If a rule describes associations between qualitative items, it is a quantitative association rule.
(D) Max patterns can be used to reduce the number of frequent item sets generated in mining.
(1) TTTT (2) FFFF (3) TFTF (4) TFTT
27. True or false :-
(A) By the upward closure property, any superset of an infrequent set is an infrequent set.
(B) By the downward closure property, any subset of a frequent set is a frequent set.
(C) A frequent set is a minimal frequent set if it is a frequent set and no superset of it is a frequent set.
(D) The maximal frequent sets can act as a compact representation of the set of all frequent sets.
(1) TTTT (2) FFFF (3) TTFT (4) TFTT
28. True or false :-
(A) In equi – depth binning, the interval size of each bin is the same.
(B) In equi – width binning, each bin has the same number of tuples assigned to it.
(C) In homogeneity – based binning, the bin size is determined so that the tuples in each bin are uniformly distributed.
(D) ARCS uses equi width binning.
(1) TTTT (2) FFFF (3) FFTT (4) TFTT
29. True or false :-
(A) Each array cell holds the count distribution of each possible class of the categorical attributes of rule – right hand side.
(B) The disadvantage of correlation is that it is upward closed.
(C) Data classification is a three – step process.
(D) Each tuple is assumed to belong to a predefined class, called the class level attribute.
(1) TTTT (2) TFFT (3) TFTF (4) TFTT
30. True or false :-
(A) Classification and prediction can be compared by using speed criteria
(B) Clustering is the form of learning by observation.
(C) An iterative relocation technique moves objects from one group to another.
(D) OPTICS is a hierarchical clustering method that grows clusters according to a threshold level.
(1) TTTF (2) FFFF (3) TFTF (4) TFTT
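Note on Question 22 of Part C: min – max and z – score normalization can be compared with a short Python sketch. The income figures are hypothetical, the target range [0, 1] is an assumed default, and the population standard deviation is used here as one common choice; the sketch also assumes the values are not all equal, since either formula would otherwise divide by zero.

```python
from statistics import mean, pstdev


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly map values from [min, max] onto [new_min, new_max]."""
    old_min, old_max = min(values), max(values)
    span = old_max - old_min  # assumed non-zero
    return [(v - old_min) / span * (new_max - new_min) + new_min for v in values]


def z_score_normalize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumed non-zero
    return [(v - mu) / sigma for v in values]


# Illustrative data only (hypothetical incomes).
incomes = [12000, 54000, 73600, 98000]
scaled = min_max_normalize(incomes)
standardized = z_score_normalize(incomes)
print([round(v, 3) for v in scaled])
print([round(v, 3) for v in standardized])
```

Min – max normalization keeps the relative spacing of the values inside a fixed range, while z – score normalization centres them on zero, which is why it is also called zero – mean normalization.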