Book, English, 236 pages, paperback, format (W × H): 170 mm × 240 mm, weight: 354 g
Series: Berichte aus der Informatik
ISBN: 978-3-8440-2145-5
Publisher: Shaker
Because minority-class examples are costly or scarce, datasets are often highly imbalanced, with a large majority class and a far smaller minority class. Typical examples of imbalanced datasets are healthy versus diseased tissue measurements, lawful versus criminal banking transactions, and correctly priced versus mispriced financial instruments. Constructing classifiers from imbalanced data presents significant theoretical and practical challenges. Validation is also affected by imbalance: a trivial classifier that ignores its input and always predicts the majority class will score a high accuracy and so appear prescient. This presentation surveys class imbalance from a conceptual perspective and empirically investigates several RapidMiner approaches to constructing classifiers from imbalanced data. Finally, the presentation describes a set of broadly applicable RapidMiner processes that detect class imbalance and that construct and evaluate classifiers on imbalanced data.
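The validation problem mentioned above can be illustrated with a small sketch. It is written in plain Python rather than as a RapidMiner process, and the 950-to-50 class split, the function names, and the choice of balanced accuracy as the alternative metric are assumptions made for illustration only, not taken from the book.

```python
# Hypothetical dataset (assumption): 950 majority-class labels (0)
# and 50 minority-class labels (1).
y_true = [0] * 950 + [1] * 50


def majority_classifier(_x):
    """Trivial classifier: ignore the input, always predict class 0."""
    return 0


y_pred = [majority_classifier(None) for _ in y_true]


def accuracy(truth, pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def recall(truth, pred, label):
    """Fraction of examples with the given true label that were predicted as it."""
    preds_for_label = [p for t, p in zip(truth, pred) if t == label]
    return sum(p == label for p in preds_for_label) / len(preds_for_label)


def balanced_accuracy(truth, pred):
    """Mean of the per-class recalls, so each class counts equally."""
    labels = sorted(set(truth))
    return sum(recall(truth, pred, lab) for lab in labels) / len(labels)


print("accuracy:         ", accuracy(y_true, y_pred))
print("minority recall:  ", recall(y_true, y_pred, 1))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

Under these assumed numbers, plain accuracy is high although no minority-class example is ever detected, while the per-class view exposes the failure.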