Meta-Algorithmics: Patterns for Robust, Low Cost, High Quality Systems
Steven J. Simske
Format: PDF / Kindle (mobi) / ePub
The confluence of cloud computing, parallelism and advanced machine intelligence approaches has created a world in which the optimum knowledge system will usually be architected from the combination of two or more knowledge-generating systems. There is a need, then, to provide a reusable, broadly-applicable set of design patterns to empower the intelligent system architect to take advantage of this opportunity.
This book explains how to design and build intelligent systems that are optimized for changing system requirements (adaptability), for changing system input (robustness), and for one or more other important system parameters (e.g., accuracy, efficiency, cost). It provides an overview of traditional parallel processing, shown to consist primarily of task and component parallelism, before introducing meta-algorithmic parallelism, which is based on combining two or more algorithms, classification engines, or other systems.
- Explains the entire roadmap for the design, testing, development, refinement, deployment, and statistics-driven optimization of intelligent systems
- Offers an accessible yet thorough overview of machine intelligence, in addition to having a strong image processing focus
- Contains design patterns for parallelism, especially meta-algorithmic parallelism: simply conveyed, reusable, proven-effective patterns that can be readily included in the toolbox of experts in analytics, system architecture, big data, security and many other science and engineering disciplines
- Connects algorithms and analytics to parallelism, thereby illustrating a new way of designing intelligent systems compatible with the tremendous changes in the computing world over the past decade
- Discusses application of the approaches to a wide range of fields, primarily document understanding, image understanding, biometrics and security printing
- Companion website contains sample code and data sets
reconstructed information are immediately fed back to change the gain (for example, weights) on the final system. A longer-viewed third-order meta-algorithmic pattern is the Proof by Task Completion pattern, which dynamically changes the weighting of the individual knowledge-generating algorithms, systems, or engines after tasks have successfully completed. This approach allows infinite scalability (new data does not change the complexity or storage needs of the meta-algorithmic pattern), and a
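A minimal sketch of the Proof by Task Completion idea described above: each engine's weight is a running success rate updated only when a task completes, so storage stays proportional to the number of engines no matter how many tasks are processed (the scalability property noted above). The class and method names are illustrative, not from the book.

```python
# Hypothetical sketch: per-engine weights updated after task completion.
# Storage is O(number of engines), independent of the number of tasks.

class TaskCompletionWeights:
    def __init__(self, engines):
        # running tallies per knowledge-generating engine
        self.successes = {e: 0 for e in engines}
        self.counts = {e: 0 for e in engines}

    def record(self, engine, succeeded):
        # called once a task has (un)successfully completed
        self.counts[engine] += 1
        if succeeded:
            self.successes[engine] += 1

    def weights(self):
        # normalized success rates used to weight each engine's output
        rates = {e: (self.successes[e] / self.counts[e]) if self.counts[e] else 0.0
                 for e in self.counts}
        total = sum(rates.values())
        n = len(rates)
        return {e: (r / total if total else 1.0 / n) for e, r in rates.items()}
```

For example, an engine that has completed two of two tasks successfully ends up with twice the weight of one that completed one of two.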
differentially. There will be much more to say on this in later sections of the book. Self-organizing feature maps (SOMs) (Kohonen, 1982) are an unsupervised learning approach to describe the topology of input data. This machine learning approach can be used to provide a lower-dimensional representation of the data in addition to its density estimation value. It should be noted that SOMs can be very useful at the front end of unsupervised clustering problems, as well. As will be discussed in
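A toy one-dimensional SOM can make the topology-preserving idea above concrete: a line of units whose weight vectors adapt toward the inputs, with a shrinking neighborhood so nearby units come to respond to nearby inputs. This is a minimal sketch under simplifying assumptions (1-D grid, linear decay schedules), not the formulation in Kohonen (1982).

```python
import math
import random

# Minimal 1-D self-organizing map sketch: units on a line adapt their
# weight vectors toward the data, pulled along with their grid neighbors.

def train_som(data, n_units=5, epochs=50, lr0=0.5, radius0=2.0, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)                     # decaying learning rate
        radius = max(radius0 * (1 - epoch / epochs), 0.5)   # shrinking neighborhood
        for x in data:
            # best-matching unit = closest weight vector to the input
            bmu = min(range(n_units),
                      key=lambda i: sum((units[i][d] - x[d]) ** 2
                                        for d in range(dim)))
            for i in range(n_units):
                # Gaussian neighborhood on the 1-D grid distance to the BMU
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    units[i][d] += lr * h * (x[d] - units[i][d])
    return units
```

Trained on two well-separated clusters, the units spread to cover both, giving the lower-dimensional, density-reflecting representation mentioned above.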
for an N-node set of locations is given by N_PP = (N − 1)!/2. As N increases to more than a trivial problem space, the number of pathways becomes unwieldy for the exhaustive search approach. As a consequence, effective means for searching a subset of the possible pathways must be selected. In Simske and Matthews (2004), several methods for selecting the next node in a pathway were given. One was termed the “lowest remaining distance” strategy, which is also known as the greedy search approach
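The “lowest remaining distance” (greedy) strategy can be sketched as follows: from the current node, always move to the nearest unvisited node. The distance-matrix representation and function name are illustrative assumptions, not from the book.

```python
# Greedy "lowest remaining distance" path construction over a symmetric
# distance matrix dist, where dist[i][j] is the distance from node i to j.

def greedy_path(dist, start=0):
    n = len(dist)
    path = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        current = path[-1]
        # pick the unvisited node with the lowest remaining distance
        nxt = min(unvisited, key=lambda j: dist[current][j])
        path.append(nxt)
        unvisited.remove(nxt)
    return path
```

This visits each node once in O(N^2) steps, versus the (N − 1)!/2 distinct pathways an exhaustive search would have to enumerate (already 181,440 for N = 10).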
s elapse). In 4.7 s, 70.5% of the data can therefore be transmitted and analyzed. In general, in T seconds, 15% × (T − 0.3 s) of the processing can occur on the back end. Meanwhile, on the mobile device, in T seconds, 8% × T of the processing can occur. Thus, in 5 s, 40% of the processing can occur on the device. The relative ratio is thus 70.5%/40% = 1.7625. A close approximation to this ratio can be obtained by dividing the data into 14 partitions and assigning 9 of these (64.3%) to the back end
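The worked figures above can be checked numerically. The rates are taken from the passage: the back end processes 15% of the data per second after a 0.3 s latency, the device 8% per second, over T = 5 s.

```python
# Reproducing the back-end vs. mobile-device split from the text.

T = 5.0                        # total time budget in seconds
back_end = 0.15 * (T - 0.3)    # 15%/s after 0.3 s latency -> 70.5%
device = 0.08 * T              # 8%/s on the device        -> 40.0%
ratio = back_end / device      # 70.5% / 40% = 1.7625

# 14 equal partitions, 9 (about 64.3%) assigned to the back end,
# approximates the ratio: 9/5 = 1.8, close to 1.7625.
back_share = 9 / 14
```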
10^-18 probability of guessing a correct identifier. So, BSL = ceiling(log2(10^18)) = ceiling(59.795) = 60; that is, 60 bits are required to provide the required security level for an individual item. Since there are 10^8 printed items, not 1, BSCL = ceiling(log2(10^8)) = ceiling(26.575) = 27. Thus, another 27 bits are required to provide sufficient additional variability for all of the items associated with the related set of products. Thus, if we can embed 87 bits into our one or more security
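The bit-length arithmetic above follows directly from the two ceilinged base-2 logarithms, and can be verified in a few lines:

```python
import math

# BSL: bits needed so the chance of guessing one identifier is 1 in 10^18.
# BSCL: additional bits needed to distinguish 10^8 printed items.

BSL = math.ceil(math.log2(10 ** 18))   # ceiling(59.795) = 60 bits per item
BSCL = math.ceil(math.log2(10 ** 8))   # ceiling(26.575) = 27 bits for the set
total_bits = BSL + BSCL                # 87 bits to embed in the security marks
```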