Stream-Based Active Learning in Changing Environments under Verification Latency
T. Pham. Organic Computing -- Doctoral Dissertation Colloquium 2021, kassel university press (2022)
Labeling data often requires human interaction, which can be slow and costly, depending on the required expertise.
Stream-based active learning reduces the labeling cost in data stream mining by querying labels only for the instances expected to improve the classifier the most. However, state-of-the-art stream-based active learning strategies assume that verification latency, i.e., the time between a query and the availability of the queried label, is negligible. In reality, this is usually not the case, as the experts who provide such labels are not always available and the labeling process itself takes time.
This article investigates the impact of verification latency in a stream-based active learning scenario for changing data streams. We focus on two types of change that occur over time: changes in the number of features and changes in the distribution of instances and/or labels. In particular, we identify challenges that arise from verification latency and propose ideas for overcoming them within a stream-based active learning strategy.
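To make the setting concrete, the following is a minimal sketch of a stream-based active learning loop in which queried labels arrive only after a fixed delay. It is an illustration under stated assumptions, not the strategy proposed in the article: the function name `run_stream`, the synthetic one-dimensional two-class stream, the nearest-class-mean classifier, the margin threshold, and the simple budget rule are all hypothetical choices made for this example.

```python
import random
from collections import deque


def run_stream(n_steps=500, latency=20, budget=0.2, threshold=0.5, seed=0):
    """Simulate budgeted uncertainty querying with delayed labels.

    All modeling choices here are illustrative assumptions, not the
    article's method. Returns (accuracy, n_queries, n_labels_in_flight).
    """
    rng = random.Random(seed)
    pending = deque()            # queried labels that have not arrived yet
    sums = {0: 0.0, 1: 0.0}      # per-class sum of labeled instances
    counts = {0: 0, 1: 0}        # per-class number of labeled instances
    n_queries = 0
    n_correct = 0

    for t in range(n_steps):
        # Labels whose latency has elapsed become available for training.
        while pending and pending[0][0] <= t:
            _, x_old, y_old = pending.popleft()
            sums[y_old] += x_old
            counts[y_old] += 1

        # Draw the next instance from a synthetic two-class stream.
        y = rng.randint(0, 1)
        x = rng.gauss(2.0 * y - 1.0, 1.0)

        # Predict with a nearest-class-mean classifier, if both means exist.
        if counts[0] and counts[1]:
            d0 = abs(x - sums[0] / counts[0])
            d1 = abs(x - sums[1] / counts[1])
            pred = 0 if d0 <= d1 else 1
            margin = abs(d0 - d1)
        else:
            pred, margin = 0, 0.0   # no usable model yet: fully uncertain
        n_correct += int(pred == y)

        # Query only uncertain instances, and only while within budget.
        # The label is requested now but arrives `latency` steps later.
        if margin < threshold and n_queries < budget * (t + 1):
            pending.append((t + latency, x, y))
            n_queries += 1

    return n_correct / n_steps, n_queries, len(pending)


if __name__ == "__main__":
    for lat in (0, 20, 100):
        print(lat, run_stream(latency=lat))
```

In this sketch, the classifier receives no labels at all during the first `latency` steps, so every instance looks maximally uncertain while queries are already being spent. This is one simple way in which latency can interact with the query decision; the sketch does not model changes in the feature space or in the data distribution.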