Convolutional neural network

important for fine-grained image classification that needs more discriminating features. Meanwhile, another advantage of the CMP operation is to make the channel number of feature maps smaller before it connects to the first fully connected (FC) layer. Similar to the MP operation, we denote the input feature maps and output feature maps of a CMP layer as F ∈ R^(C×M×N) and C ∈ R^(c×M×N), respectively, where C and c are the channel numbers of the input and output feature maps, and M and N are the width and the height of the feature maps, respectively. Note that the CMP operation only changes the channel number of the feature maps. The width and the height of the feature maps are not changed, which is different from the MP operation.
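As a minimal sketch of the shape behaviour described above, the following NumPy snippet applies max pooling along the channel axis only. The function name `channel_max_pool`, the `group_size` parameter, and the choice of non-overlapping groups of consecutive channels are illustrative assumptions, not details taken from the source.

```python
import numpy as np


def channel_max_pool(feature_maps, group_size):
    """Max-pool over groups of consecutive channels (hypothetical helper).

    feature_maps has shape (C, M, N); the result has shape
    (C // group_size, M, N), so only the channel count changes.
    """
    channels, height, width = feature_maps.shape
    if channels % group_size != 0:
        raise ValueError("channel count must be divisible by group_size")
    # Split the channel axis into (groups, group_size) and reduce each group.
    grouped = feature_maps.reshape(
        channels // group_size, group_size, height, width
    )
    return grouped.max(axis=1)


# Example input: 4 channels of size 2 x 3 with increasing values.
example_input = np.arange(4 * 2 * 3).reshape(4, 2, 3)
example_output = channel_max_pool(example_input, group_size=2)
```

In this sketch the spatial dimensions stay at 2 × 3 while the channel count drops from 4 to 2.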
"C-layer": a downsampling layer that contains units whose receptive fields cover patches of previous convolutional layers. Such a unit typically computes a weighted average of the activations of the units in its patch, applies inhibition (divisive normalization) pooled from a somewhat larger patch and across different filters in a layer, and applies a saturating activation function. The patch weights are nonnegative and are not trainable in the original neocognitron. The downsampling and competitive inhibition help to classify features and objects in visual scenes even when the objects are shifted.
small or thin, such as a small ant on a stem of a flower or a person holding a quill in their hand. They also have trouble with images that have been distorted with filters, an increasingly common phenomenon with modern digital cameras. By contrast, those kinds of images rarely trouble humans. Humans, however, tend to have trouble with other issues. For example, they are not good at classifying objects into fine-grained categories such as the particular breed of dog or species of bird, whereas convolutional neural networks handle this.
) in a CNN architecture. While pooling layers contribute to local translation invariance, they do not provide global translation invariance in a CNN, unless a form of global pooling is used. The pooling layer commonly operates independently on every depth, or slice, of the input and resizes it spatially. A very common form of max pooling is a layer with filters of size 2×2, applied with a stride of 2, which subsamples every depth slice in the input by 2 along both width and height, discarding 75% of the activations:
, and demonstrated it on a speech recognition task. They also pointed out that as a data-trainable system, convolution is essentially equivalent to correlation since reversal of the weights does not affect the final learned function ("For convenience, we denote * as correlation instead of convolution. Note that convolving a(t) with b(t) is equivalent to correlating a(-t) with b(t)."). Modern CNN implementations typically do correlation and call it convolution, for convenience, as they did here.
(AlexNet image size should be 227×227×3, instead of 224×224×3, so the math will come out right. The original paper said different numbers, but Andrej Karpathy, the head of computer vision at Tesla, said it should be 227×227×3 (he said Alex did not describe why he put 224×224×3). The next convolution should be 11×11 with stride 4: 55×55×96 (instead of 54×54×96). It would be calculated, for example, as: [(input width 227 − kernel width 11) / stride 4] + 1 = [(227 − 11) / 4] + 1 = 55. Since the kernel output is the same length as width, its area is 55×55.)
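The output-size arithmetic for the 227-wide input, 11-wide kernel, and stride 4 mentioned above can be checked with a short sketch. The helper name `conv_output_size` and its `padding` parameter are illustrative assumptions; the formula used is the usual (W − K + 2P) / S + 1 relation for a convolution along one axis.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Number of output positions along one axis (hypothetical helper)."""
    span = input_size - kernel_size + 2 * padding
    if span < 0:
        raise ValueError("kernel is larger than the padded input")
    if span % stride != 0:
        # The kernel does not tile the input evenly with this stride.
        raise ValueError("stride does not divide (W - K + 2P) evenly")
    return span // stride + 1


# 227-wide input, 11-wide kernel, stride 4, no padding.
alexnet_first_layer_width = conv_output_size(227, 11, 4)

# A 224-wide input does not divide evenly with the same kernel and stride.
try:
    conv_output_size(224, 11, 4)
    input_224_fits = True
except ValueError:
    input_224_fits = False
```

With a 227-wide input the result is 55 positions per axis, while the 224-wide input leaves a remainder, which is the mismatch the parenthetical remark describes.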
convolutional neural networks to effectively learn time series dependences. Convolutions can be implemented more efficiently than RNN-based solutions, and they do not suffer from vanishing (or exploding) gradients. Convolutional networks can provide an improved forecasting performance when there are multiple similar time series to learn from. CNNs can also be applied to further tasks in time series analysis (e.g., time series classification or quantile forecasting).
Thus, in each convolutional layer, each neuron takes input from a larger area in the input than previous layers. This is due to applying the convolution over and over, which takes the value of a pixel into account, as well as its surrounding pixels. When using dilated layers, the number of pixels in the receptive field remains constant, but the field is more sparsely populated as its dimensions grow when combining the effect of several layers.
Zhang et al. (1988) used back-propagation to train the convolution kernels of a CNN for alphabet recognition. The model was called shift-invariant pattern recognition neural network before the name CNN was coined later in the early 1990s. Wei Zhang et al. also applied the same CNN without the last fully connected layer for medical image object segmentation (1991) and breast cancer detection in mammograms (1994).
higher-level entity (e.g. face) is present when the lower-level (e.g. nose and mouth) agree on its prediction of the pose. The vectors of neuronal activity that represent pose ("pose vectors") allow spatial transformations modeled as linear operations that make it easier for the network to learn the hierarchy of visual entities and generalize across viewpoints. This is similar to the way the human
does not imply translation invariance, as the fully connected layers are not invariant to shifts of the input. One solution for complete translation invariance is avoiding any down-sampling throughout the network and applying global average pooling at the last layer. Additionally, several other partial solutions have been proposed, such as
be spared for later testing, two approaches are to either generate new data from scratch (if possible) or perturb existing data to create new ones. The latter has been used since the mid-1990s. For example, input images can be cropped, rotated, or rescaled to create new examples with the same labels as the original training set.
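A minimal sketch of perturbing existing data, assuming images stored as NumPy arrays of shape (height, width, channels). The function name `augment`, the crop size, and the restriction to 90-degree rotations are illustrative choices, not part of the source.

```python
import numpy as np


def augment(image, crop_size, rng):
    """Return a randomly cropped and rotated copy of a square-croppable image."""
    height, width = image.shape[:2]
    # Random crop: pick the top-left corner so the patch stays inside the image.
    top = int(rng.integers(0, height - crop_size + 1))
    left = int(rng.integers(0, width - crop_size + 1))
    patch = image[top:top + crop_size, left:left + crop_size]
    # Random rotation by a multiple of 90 degrees in the image plane.
    quarter_turns = int(rng.integers(0, 4))
    return np.rot90(patch, k=quarter_turns, axes=(0, 1))


rng = np.random.default_rng(0)
image = np.arange(32 * 32 * 3).reshape(32, 32, 3)
label = 7  # hypothetical class label of the original image

# Each perturbed copy keeps the label of the image it was derived from.
augmented_examples = [(augment(image, 28, rng), label) for _ in range(4)]
```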
"S-layer": a shared-weights receptive-field layer, later known as a convolutional layer, which contains units whose receptive fields cover a patch of the previous layer. A shared-weights receptive-field group (a "plane" in neocognitron terminology) is often called a filter, and a layer typically has several such filters.
of object detection to 0.439329, and reduced classification error to 0.06656, the best result to date. Its network applied more than 30 layers. That performance of convolutional neural networks on the ImageNet tests was close to that of humans. The best algorithms still struggle with objects that are
L1 regularization is also common. It makes the weight vectors sparse during optimization. In other words, neurons with L1 regularization end up using only a sparse subset of their most important inputs and become nearly invariant to the noisy inputs. L1 with L2 regularization can be combined; this is
Because the degree of model overfitting is determined by both its power and the amount of training it receives, providing a convolutional network with more training examples can reduce overfitting. Because there is often not enough available data to train, especially considering that some part should
Intuitively, the exact location of a feature is less important than its rough location relative to other features. This is the idea behind the use of pooling in convolutional neural networks. The pooling layer serves to progressively reduce the spatial size of the representation, to reduce the number
A parameter sharing scheme is used in convolutional layers to control the number of free parameters. It relies on the assumption that if a patch feature is useful to compute at some spatial position, then it should also be useful to compute at other positions. Denoting a single 2-dimensional slice of
Sometimes, it is convenient to pad the input with zeros (or other values, such as the average of the region) on the border of the input volume. The size of this padding is a third hyperparameter. Padding provides control of the output volume's spatial size. In particular, sometimes it is desirable to
The feed-forward architecture of convolutional neural networks was extended in the neural abstraction pyramid by lateral and feedback connections. The resulting recurrent convolutional network allows for the flexible incorporation of contextual information to iteratively resolve local ambiguities. In
A shift-invariant neural network was proposed by Wei Zhang et al. for image character recognition in 1988. It is a modified Neocognitron by keeping only the convolutional interconnections between the image feature layers and the last fully connected layer. The model was trained with back-propagation.
Compared to image data domains, there is relatively little work on applying CNNs to video classification. Video is more complex than images since it has another (temporal) dimension. However, some extensions of CNNs into the video domain have been explored. One approach is to treat space and time as
An earlier common way to deal with this problem is to train the network on transformed data in different orientations, scales, lighting, etc. so that the network can cope with these variations. This is computationally intensive for large data-sets. The alternative is to use a hierarchy of coordinate
L2 regularization is the most common form of regularization. It can be implemented by penalizing the squared magnitude of all parameters directly in the objective. The L2 regularization has the intuitive interpretation of heavily penalizing peaky weight vectors and preferring diffuse weight vectors.
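The preference for diffuse over peaky weight vectors can be illustrated with a small sketch. The function name `l2_penalty`, the `strength` parameter, and the two example vectors are hypothetical; the penalty is simply the sum of squared parameters scaled by a constant.

```python
import numpy as np


def l2_penalty(weights, strength):
    """Sum of squared parameters over all weight arrays, scaled by strength."""
    return strength * sum(float(np.sum(w ** 2)) for w in weights)


# Two weight vectors that produce the same output for an all-ones input.
peaky = np.array([1.0, 0.0, 0.0, 0.0])
diffuse = np.array([0.25, 0.25, 0.25, 0.25])

peaky_cost = l2_penalty([peaky], strength=1.0)      # 1.0
diffuse_cost = l2_penalty([diffuse], strength=1.0)  # 0.25
```

Both vectors sum to 1, but the diffuse vector incurs a quarter of the penalty, so adding this term to the objective favours spreading weight across inputs.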
Another simple way to prevent overfitting is to limit the number of parameters, typically by limiting the number of hidden units in each layer or limiting network depth. For convolutional networks, the filter size also affects the number of parameters. Limiting the number of parameters restricts the
Padding is the addition of (typically) 0-valued pixels on the borders of an image. This is done so that the border pixels are not undervalued (lost) from the output because they would ordinarily participate in only a single receptive field instance. The padding applied is typically one less than the
Three example padding conditions. Replication condition means that the pixel outside is padded with the closest pixel inside. The reflection padding is where the pixel outside is padded with the pixel inside, reflected across the boundary of the image. The circular padding is where the pixel outside
Stacking the activation maps for all filters along the depth dimension forms the full output volume of the convolution layer. Every entry in the output volume can thus also be interpreted as an output of a neuron that looks at a small region in the input. Each entry in an activation map uses the same
In 1990 Yamaguchi et al. introduced the concept of max pooling, a fixed filtering operation that calculates and propagates the maximum value of a given region. They did so by combining TDNNs with max pooling to realize a speaker-independent isolated word recognition system. In their system they used
et al. for phoneme recognition and was one of the first convolutional networks, as it achieved shift-invariance. A TDNN is a 1-D convolutional neural net where the convolution is performed along the time axis of the data. It is the first CNN utilizing weight sharing in combination with a training by
DropConnect is similar to dropout as it introduces dynamic sparsity within the model, but differs in that the sparsity is on the weights, rather than the output vectors of a layer. In other words, the fully connected layer with DropConnect becomes a sparsely connected layer in which the connections
of the input signal. While, in principle, CNNs are capable of implementing anti-aliasing filters, it has been observed that this does not happen in practice, yielding models that are not equivariant to translations. Furthermore, if a CNN makes use of fully connected layers, translation equivariance
A channel max pooling (CMP) operation layer conducts the MP operation along the channel side among the corresponding positions of the consecutive feature maps for the purpose of redundant information elimination. The CMP makes the significant features gather together within fewer channels, which is
of the output volume controls the number of neurons in a layer that connect to the same region of the input volume. These neurons learn to activate for different features in the input. For example, if the first convolutional layer takes the raw image as input, then different neurons along the depth
Shared weights: In CNNs, each filter is replicated across the entire visual field. These replicated units share the same parameterization (weight vector and bias) and form a feature map. This means that all the neurons in a given convolutional layer respond to the same feature within their specific
trained deep feedforward networks on GPUs. In 2011, they extended this to CNNs, accelerating training by a factor of 60 compared to CPU training. In 2011, the network won an image recognition contest where they achieved superhuman performance for the first time. Then they won more competitions and achieved state of the
neuron in the second layer. Convolution reduces the number of free parameters, allowing the network to be deeper. For example, using a 5 × 5 tiling region, each with the same shared weights, requires only 25 neurons. Using regularized weights over fewer parameters avoids the vanishing gradients and
Sometimes, the parameter sharing assumption may not make sense. This is especially the case when the input images to a CNN have some specific centered structure, for which we expect completely different features to be learned on different spatial locations. One practical example is when the inputs
Each neuron in a neural network computes an output value by applying a specific function to the input values received from the receptive field in the previous layer. The function that is applied to the input values is determined by a vector of weights and a bias (typically real numbers). Learning
To manipulate the receptive field size as desired, there are some alternatives to the standard convolutional layer. For example, atrous or dilated convolution expands the receptive field size without increasing the number of parameters by interleaving visible and blind regions. Moreover, a single
Thus, one way to represent something is to embed the coordinate frame within it. This allows large features to be recognized by using the consistency of the poses of their parts (e.g. nose and mouth poses make a consistent prediction of the pose of the whole face). This approach ensures that the
et al. (1989) used back-propagation to learn the convolution kernel coefficients directly from images of hand-written numbers. Learning was thus fully automatic, performed better than manual coefficient design, and was suited to a broader range of image recognition problems and image types. Wei
In 2015, a many-layered CNN demonstrated the ability to spot faces from a wide range of angles, including upside down, even when partially occluded, with competitive performance. The network was trained on a database of 200,000 images that included faces at various angles and orientations and a
Pooling loses the precise spatial relationships between high-level parts (such as nose and mouth in a face image). These relationships are needed for identity recognition. Overlapping the pools so that each feature occurs in multiple pools helps retain the information. Translation alone cannot
Convolutional networks may include local and/or global pooling layers along with traditional convolutional layers. Pooling layers reduce the dimensions of data by combining the outputs of neuron clusters at one layer into a single neuron in the next layer. Local pooling combines small clusters,
can be used to learn features and classify data, this architecture is generally impractical for larger inputs (e.g., high-resolution images), which would require massive numbers of neurons because each pixel is a relevant input feature. A fully connected layer for an image of size 100 × 100 has
data. Typical ways of regularization, or preventing overfitting, include: penalizing parameters during training (such as weight decay) or trimming connectivity (skipped connections, dropout, etc.). Robust datasets also increase the probability that CNNs will learn the generalized principles that
TDNNs are convolutional networks that share weights along the temporal dimension. They allow speech signals to be processed time-invariantly. In 1990 Hampshire and Waibel introduced a variant that performs a two-dimensional convolution. Since these TDNNs operated on spectrograms, the resulting
Recurrent neural networks are generally considered the best neural network architectures for time series forecasting (and sequence modeling in general), but recent studies show that convolutional networks can perform comparably or even better. Dilated convolutions might enable one-dimensional
When dealing with high-dimensional inputs such as images, it is impractical to connect neurons to all neurons in the previous volume because such a network architecture does not take the spatial structure of the data into account. Convolutional networks exploit spatially local correlation by
, feature maps are divided into rectangular sub-regions, and the features in each rectangle are independently down-sampled to a single value, commonly by taking their average or maximum value. In addition to reducing the sizes of feature maps, the pooling operation grants a degree of local
In a variant of the neocognitron called the cresceptron, instead of using Fukushima's spatial averaging with inhibition and saturation, J. Weng et al. in 1993 introduced a method called max-pooling where a downsampling unit computes the maximum of the activations of the units in its patch.
Dilation involves ignoring pixels within a kernel. This reduces processing/memory potentially without significant signal loss. A dilation of 2 on a 3x3 kernel expands the kernel to 5x5, while still processing 9 (evenly spaced) pixels. Accordingly, a dilation of 3 expands the kernel to 7x7, and a dilation of 4 expands it to 9x9.
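The footprint of a dilated kernel can be computed with a short sketch. It assumes the convention in which a dilation of d places the kernel taps d pixels apart, so that a dilation of 1 is an ordinary kernel; the helper names are hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Width of the input window covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def dilated_tap_offsets(kernel_size, dilation):
    """Offsets (relative to the first tap) of the pixels actually read."""
    return [index * dilation for index in range(kernel_size)]


footprint_d1 = effective_kernel_size(3, 1)  # ordinary 3-wide kernel
footprint_d2 = effective_kernel_size(3, 2)  # covers 5 pixels per axis
taps_d2 = dilated_tap_offsets(3, 2)         # reads pixels 0, 2 and 4
```

Under this convention the number of taps per axis stays at 3 regardless of dilation, so a 3x3 kernel always processes 9 pixels while its footprint grows.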
are faces that have been centered in the image: we might expect different eye-specific or hair-specific features to be learned in different parts of the image. In that case it is common to relax the parameter sharing scheme, and instead simply call the layer a "locally connected layer".
using co-evolution. The learning process did not use prior human professional games, but rather focused on a minimal set of information contained in the checkerboard: the location and type of pieces, and the difference in number of pieces between the two sides. Ultimately, the program
That is, it applies global max pooling, then applies max pooling to the image divided into 4 equal parts, then 16, etc. The results are then concatenated. It is a hierarchical form of global pooling, and similar to global pooling, it is often used just before a classification head.
, images are only of size 32×32×3 (32 wide, 32 high, 3 color channels), so a single fully connected neuron in the first hidden layer of a regular neural network would have 32*32*3 = 3,072 weights. A 200×200 image, however, would lead to neurons that have 200*200*3 = 120,000 weights.
was reported. Another paper on using CNN for image classification reported that the learning process was "surprisingly fast"; in the same paper, the best published results as of 2011 were achieved in the MNIST database and the NORB database. Subsequently, a similar CNN called
extrapolate the understanding of geometric relationships to a radically new viewpoint, such as a different orientation or scale. On the other hand, people are very good at extrapolating; after seeing a new shape once they can recognize it from a different viewpoint.
: width, height and depth. Each neuron inside a convolutional layer is connected to only a small region of the layer before it, called a receptive field. Distinct types of layers, both locally and completely connected, are stacked to form a CNN architecture.
The training algorithm was further improved in 1991 to improve its generalization ability. The model architecture was modified by removing the last fully connected layer and applied for medical image segmentation (1991) and automatic detection of breast cancer in
The system trains directly on 3-dimensional representations of chemical interactions. Similar to how image recognition networks learn to compose smaller, spatially proximate features into larger, complex structures, AtomNet discovers chemical features, such as
ensures that the input volume and output volume will have the same size spatially. However, it is not always completely necessary to use all of the neurons of the previous layer. For example, a neural network designer may decide to use just a portion of padding.
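A small sketch of how border padding affects size, using NumPy's `np.pad` on a one-dimensional row. Mapping the zero, replication, reflection, and circular conditions to the `np.pad` modes `constant`, `edge`, `reflect`, and `wrap` is an assumption made for illustration, as is the choice of a 3-wide averaging kernel with stride 1 and one padded pixel per side.

```python
import numpy as np

row = np.array([1, 2, 3, 4])

# One padded pixel on each side, under four different boundary conditions.
zero_padded = np.pad(row, 1, mode="constant")  # [0 1 2 3 4 0]
replicated = np.pad(row, 1, mode="edge")       # [1 1 2 3 4 4]
reflected = np.pad(row, 1, mode="reflect")     # [2 1 2 3 4 3]
circular = np.pad(row, 1, mode="wrap")         # [4 1 2 3 4 1]

# With a 3-wide kernel and stride 1, one pixel of padding per side keeps
# the output as long as the unpadded input.
kernel = np.ones(3) / 3.0
same_size_output = np.convolve(zero_padded, kernel, mode="valid")
```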
with pixel position is kept roughly constant across layers. Preserving more information about the input would require keeping the total number of activations (number of feature maps times number of pixel positions) non-decreasing from one layer to the next.
A CNN architecture is formed by a stack of distinct layers that transform the input volume into an output volume (e.g. holding the class scores) through a differentiable function. A few distinct types of layers are commonly used. These are further discussed
is a benchmark in object classification and detection, with millions of images and hundreds of object classes. In the ILSVRC 2014, a large-scale visual recognition challenge, almost every highly ranked team used CNN as their basic framework. The winner
, so that a reduced network is left; incoming and outgoing edges to a dropped-out node are also removed. Only the reduced network is trained on the data in that stage. The removed nodes are then reinserted into the network with their original weights.
A common technique is to train the network on a larger data set from a related domain. Once the network parameters have converged, an additional training step is performed using the in-domain data to fine-tune the network weights. This is known as
by taking the average over the height and width. It was first proposed in Network-in-Network. The same applies to global max pooling and other forms of pooling. It is often used just before the final fully connected layers in a CNN classification head.
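A minimal sketch of global pooling, assuming feature maps stored as a NumPy array of shape (channels, height, width). The function names are illustrative.

```python
import numpy as np


def global_average_pool(feature_maps):
    """Reduce each channel to the mean over its height and width."""
    return feature_maps.mean(axis=(1, 2))


def global_max_pool(feature_maps):
    """Reduce each channel to the maximum over its height and width."""
    return feature_maps.max(axis=(1, 2))


# Two channels of size 2 x 3: values 0..5 and 6..11.
feature_maps = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
average_vector = global_average_pool(feature_maps)  # one value per channel
max_vector = global_max_pool(feature_maps)
```

Either way the output is a vector with one entry per channel, independent of the spatial size of the input.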
After several convolutional and max pooling layers, the final classification is done via fully connected layers. Neurons in a fully connected layer have connections to all activations in the previous layer, as seen in regular (non-convolutional)
Neighboring cells have similar and overlapping receptive fields. Receptive field size and location vary systematically across the cortex to form a complete map of visual space. The cortex in each hemisphere represents the contralateral
: A deep learning toolkit written by Microsoft with several unique features enhancing scalability over multiple nodes. It supports full-fledged interfaces for training in C++ and Python and with additional support for model inference in
library. Allows user to write symbolic mathematical expressions, then automatically generates their derivatives, saving the user from having to code gradients or backpropagation. These symbolic expressions are automatically compiled to
The first GPU-implementation of a CNN was described in 2006 by K. Chellapilla et al. Their implementation was 4 times faster than an equivalent implementation on CPU. In the same period, GPUs were also used for unsupervised training of
A deconvolutional layer is the transpose of a convolutional layer. Specifically, a convolutional layer can be written as a multiplication with a matrix, and a deconvolutional layer is multiplication with the transpose of that matrix.
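The matrix view can be made concrete in one dimension. The sketch below builds the matrix for a "valid" sliding-window correlation (the operation CNN implementations usually call convolution) and then multiplies by its transpose. The helper name `correlation_matrix` and the example values are illustrative assumptions.

```python
import numpy as np


def correlation_matrix(kernel, input_length):
    """Matrix W such that W @ x is the valid sliding-window correlation of x."""
    kernel_length = len(kernel)
    output_length = input_length - kernel_length + 1
    matrix = np.zeros((output_length, input_length))
    for row in range(output_length):
        # Each row holds the kernel shifted one position further right.
        matrix[row, row:row + kernel_length] = kernel
    return matrix


kernel = np.array([1.0, 2.0, 3.0])
signal = np.array([1.0, 0.0, 2.0, 0.0, 1.0])

W = correlation_matrix(kernel, len(signal))

# Forward (convolutional) pass: length 5 -> length 3.
features = W @ signal

# Transposed pass: length 3 -> length 5, using the same weights.
expanded = W.T @ features
```

Note that the transposed pass restores the original length but not the original values; it is the transpose of the forward map, not its inverse.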
Convolutional neural networks represent deep learning architectures that are currently used in a wide range of applications, including computer vision, speech recognition, malware detection, time series analysis in finance, and many
) of the weight vector, to the error at each node. The level of acceptable model complexity can be reduced by increasing the proportionality constant ('alpha' hyperparameter), thus increasing the penalty for large weight vectors.
Fuego 1.1 in a fraction of the time it took Fuego to play. Later it was announced that a large 12-layer convolutional neural network had correctly predicted the professional move in 55% of positions, equalling the accuracy of a
A notable development is a parallelization method for training convolutional neural networks on the Intel Xeon Phi, named Controlled Hogwild with Arbitrary Order of Synchronization (CHAOS). CHAOS exploits both the thread- and
) digitized in 32x32 pixel images. The ability to process higher-resolution images requires larger and more layers of convolutional neural networks, so this technique is constrained by the availability of computing resources.
In neural networks, each neuron receives input from some number of locations in the previous layer. In a convolutional layer, each neuron receives input from only a restricted area of the previous layer called the neuron's
As the convolution kernel slides along the input matrix for the layer, the convolution operation generates a feature map, which in turn contributes to the input of the next layer. This is followed by other layers such as
Convolutional layers convolve the input and pass its result to the next layer. This is similar to the response of a neuron in the visual cortex to a specific stimulus. Each convolutional neuron processes data only for its
(CDBN) have structure very similar to convolutional neural networks and are trained similarly to deep belief networks. Therefore, they exploit the 2D structure of images, like CNNs do, and make use of pre-training like
Local connectivity: following the concept of receptive fields, CNNs exploit spatial locality by enforcing a local connectivity pattern between neurons of adjacent layers. The architecture thus ensures that the learned
These models mitigate the challenges posed by the MLP architecture by exploiting the strong spatially local correlation present in natural images. As opposed to MLPs, CNNs have the following distinguishing features:
that become increasingly global (i.e. responsive to a larger region of pixel space) so that the network first creates representations of small parts of the input, then from them assembles representations of larger
equivalent dimensions of the input and perform convolutions in both time and space. Another way is to fuse the features of two convolutional neural networks, one for the spatial and one for the temporal stream.
human player. When the trained convolutional network was used directly to play games of Go, without any search, it beat the traditional search program GNU Go in 97% of games, and matched the performance of the
One of the simplest methods to prevent overfitting of a network is to simply stop the training before overfitting has had a chance to occur. It comes with the disadvantage that the learning process is halted.
Since feature map size decreases with depth, layers near the input layer tend to have fewer filters while higher layers can have more. To equalize computation at each layer, the product of feature values
Also, such network architecture does not take into account the spatial structure of data, treating input pixels which are far apart in the same way as pixels that are close together. This ignores
To speed processing, standard convolutional layers can be replaced by depthwise separable convolutional layers, which are based on a depthwise convolution followed by a pointwise convolution. The
, CNNs can represent different contextual realities of language that do not rely on a series-sequence assumption, while RNNs are better suited when classical time series modeling is required.
in data with a grid-topology (such as images), both computationally and semantically. Thus, full connectivity of neurons is wasteful for purposes such as image recognition that are dominated by
The results of each TDNN over the input signal were combined using max pooling and the outputs of the pooling layers were then passed on to networks performing the actual word classification.
numbers. However, the lack of an efficient training method to determine the kernel coefficients of the involved convolutions meant that all the coefficients had to be laboriously hand-designed.
By avoiding training all nodes on all training data, dropout decreases overfitting. The method also significantly improves training speed. This makes the model combination practical, even for
This is known as down-sampling. It is common to periodically insert a pooling layer between successive convolutional layers (each one typically followed by an activation function, such as a
because a single bias and a single vector of weights are used across all receptive fields that share that filter, as opposed to each receptive field having its own bias and vector weighting.
Due to multiplicative interactions between weights and inputs this has the useful property of encouraging the network to use all of its inputs a little rather than some of its inputs a lot.
In 2004, it was shown by K. S. Oh and K. Jung that standard neural networks can be greatly accelerated on GPUs. Their implementation was 20 times faster than an equivalent implementation on
It was not used in his neocognitron since all the weights were nonnegative; lateral inhibition was used instead. The rectifier has become the most popular activation function for CNNs and
A general-purpose deep learning library for the JVM production stack running on a C++ scientific computing engine. Allows the creation of custom layers. Integrates with Hadoop and Kafka.
Hinton, Geoffrey E.; Srivastava, Nitish; Krizhevsky, Alex; Sutskever, Ilya; Salakhutdinov, Ruslan R. (2012). "Improving neural networks by preventing co-adaptation of feature detectors".
(or cross-correlation) kernels, only 25 weights are required to process 5x5-sized tiles. Higher-layer features are extracted from wider context windows, compared to lower-layer features.
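The weight counts in this passage follow from simple arithmetic. The sketch below is only an illustration: it assumes one fully connected neuron looking at the whole image versus one shared kernel, ignores biases, and uses hypothetical function names that do not come from any library.

```python
def dense_weights_per_neuron(height, width, channels=1):
    """Weights one fully connected neuron needs to see every input value."""
    return height * width * channels


def conv_weights_per_filter(kernel_height, kernel_width, channels=1):
    """Shared weights in one convolution kernel, independent of image size."""
    return kernel_height * kernel_width * channels


print(dense_weights_per_neuron(100, 100))       # 10000
print(conv_weights_per_filter(5, 5))            # 25
print(dense_weights_per_neuron(1000, 1000, 3))  # 3000000
```

The last line reproduces the 3 million weights per fully connected neuron quoted for a 1000×1000-pixel colour image.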
Lee, Honglak; Grosse, Roger; Ranganath, Rajesh; Ng, Andrew Y. (1 January 2009). "Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations".
The notion of convolution or correlation used in the models presented is popular in engineering disciplines and has been applied extensively to designing filters, control systems, etc.
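The relationship between convolution and correlation can be shown with a short one-dimensional sketch. It assumes "valid" mode (no padding) and plain Python lists; the function names are illustrative only.

```python
def correlate1d(signal, kernel):
    """Valid-mode cross-correlation: slide the kernel without flipping it."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(n)]


def convolve1d(signal, kernel):
    """Valid-mode convolution: cross-correlation with the kernel reversed."""
    return correlate1d(signal, kernel[::-1])


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]
print(correlate1d(signal, kernel))  # [-2, -2, -2]
print(convolve1d(signal, kernel))   # [2, 2, 2]
```

Because the two operations differ only by a reversal of the kernel, a network that learns its kernel weights can use either one and arrive at the same function.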
They provide a generic structure that can be used in many image and signal processing tasks. Benchmark results on standard image datasets like CIFAR have been obtained using CDBNs.
-based measures are used in conjunction with geometric neural networks (GNNs), e.g. for period classification of those clay tablets being among the oldest documents of human history.
predictive power of the network directly, reducing the complexity of the function that it can perform on the data, and thus limits the amount of overfitting. This is equivalent to a "
network has been applied to process and make predictions from many different types of data including text, images and audio. Convolution-based networks are the de-facto standard in
Le, Q. V.; Zou, W. Y.; Yeung, S. Y.; Ng, A. Y. (2011-01-01). "Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis".
contrast to previous models, image-like outputs at the highest resolution were generated, e.g., for semantic segmentation, image reconstruction, and object localization tasks.
tiling sizes such as 2 × 2 are commonly used. Global pooling acts on all the neurons of the feature map. There are two common types of pooling in popular use: max and average.
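Max and average pooling over 2 × 2 tiles, and their global counterparts, can be sketched in a few lines. This is a minimal illustration on a single-channel feature map stored as nested lists; the helper names `pool2d` and `global_pool` are assumptions made for this example.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Pool a 2-D list of numbers over size-by-size windows."""
    reducer = max if mode == "max" else (lambda w: sum(w) / len(w))
    rows = range(0, len(feature_map) - size + 1, stride)
    cols = range(0, len(feature_map[0]) - size + 1, stride)
    return [[reducer([feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)])
             for c in cols] for r in rows]


def global_pool(feature_map, mode="max"):
    """Reduce the whole feature map to a single number."""
    values = [v for row in feature_map for v in row]
    return max(values) if mode == "max" else sum(values) / len(values)


fmap = [[1, 3, 2, 4],
        [5, 6, 1, 2],
        [7, 2, 9, 1],
        [3, 4, 5, 6]]
print(pool2d(fmap, mode="max"))      # [[6, 4], [7, 9]]
print(pool2d(fmap, mode="average"))  # [[3.75, 2.25], [4.0, 5.25]]
print(global_pool(fmap, "average"))  # 3.8125
```

With a 2 × 2 window and stride 2, each output value summarises four inputs, so the width and height are halved while the number of channels is unchanged.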
and an output layer. In a convolutional neural network, the hidden layers include one or more layers that perform convolutions. Typically this includes a layer that performs a
Due to the effects of fast spatial reduction of the size of the representation, there is a recent trend towards using smaller filters or discarding pooling layers altogether.
Zang, Jinliang; Wang, Le; Liu, Ziyi; Zhang, Qilin; Hua, Gang; Zheng, Nanning (2018). "Attention-Based Temporal Weighted Convolutional Neural Network for Action Recognition".
Wallach, Izhar; Dzamba, Michael; Heifets, Abraham (2015-10-09). "AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery".
, and the set of activation maps for each different filter are stacked together along the depth dimension to produce the output volume. Parameter sharing contributes to the
, given by the activities within the pooling region. This approach is free of hyperparameters and can be combined with other regularization approaches, such as dropout and
Common filter sizes found in the literature vary greatly, and are usually chosen based on the data set. Typical filter sizes range from 1x1 to 7x7. As two famous examples,
, CNNs achieved a large decrease in error rate. Another paper reported a 97.6% recognition rate on "5,600 still images of more than 10 subjects". CNNs were used to assess
An unpooling layer expands the layer. The max-unpooling layer is the simplest, as it simply copies each entry multiple times. For example, a 2-by-2 max-unpooling layer is
It is commonly assumed that CNNs are invariant to shifts of the input. Convolution or pooling layers within a CNN that do not have a stride greater than one are indeed
to enforce the constraint. In practice, this corresponds to performing the parameter update as normal, and then enforcing the constraint by clamping the weight vector
Hubert Mara and Bartosz Bogacz (2019), "Breaking the Code on Broken Tablets: The Learning Challenge for Annotated Cuneiform Script in Normalized 2D and 3D Datasets",
The stride is the number of pixels that the analysis window moves on each iteration. A stride of 2 means that each kernel is offset by 2 pixels from its predecessor.
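The effect of stride, kernel size and padding on the output size can be checked with the usual formula (W − K + 2P) / S + 1 along each axis. The sketch below is an illustration with a hypothetical function name; it raises an error when the kernel does not tile the input evenly.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size along one axis: (W - K + 2P) / S + 1."""
    span = input_size - kernel_size + 2 * padding
    if span < 0 or span % stride != 0:
        raise ValueError("kernel, stride and padding do not tile the input")
    return span // stride + 1


print(conv_output_size(227, 11, stride=4))            # 55
print(conv_output_size(7, 3, stride=2))               # 3
print(conv_output_size(32, 3, stride=1, padding=1))   # 32
```

The first call reproduces the AlexNet arithmetic mentioned elsewhere in this article (a 227-pixel input, an 11×11 kernel and stride 4 give 55). The last call shows that one pixel of padding on each side keeps the size unchanged for a 3×3 kernel at stride 1.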
, search query retrieval, sentence modeling, classification, prediction and other traditional NLP tasks. Compared to traditional language processing methods such as
Collobert, Ronan; Weston, Jason; Bottou, Leon; Karlen, Michael; Kavukcuoglu, Koray; Kuksa, Pavel (2011-03-02). "Natural Language Processing (almost) from Scratch".
Viebke, Andre; Memeti, Suejb; Pllana, Sabri; Abraham, Ajith (2019). "CHAOS: a parallelization scheme for training convolutional neural networks on Intel Xeon Phi".
Using stochastic pooling in a multilayer model gives an exponential number of deformations since the selections in higher layers are independent of those below.
A CNN with 1-D convolutions was used on time series in the frequency domain (spectral residual) by an unsupervised model to detect anomalies in the time domain.
pooling. Average pooling was often used historically but has recently fallen out of favor compared to max pooling, which generally performs better in practice.
in 1987. Their paper replaced multiplication with convolution in time, inherently providing shift invariance, motivated by and connecting more directly to the
Bai, Shaojie; Kolter, J. Zico; Koltun, Vladlen (2018-04-19). "An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling".
van den Oord, Aaron; Dieleman, Sander; Schrauwen, Benjamin (2013-01-01). Burges, C. J. C.; Bottou, L.; Welling, M.; Ghahramani, Z.; Weinberger, K. Q. (eds.).
controls how depth columns around the width and height are allocated. If the stride is 1, then we move the filters one pixel at a time. This leads to heavily
Lecun, Y.; Jackel, L. D.; Bottou, L.; Cortes, C.; Denker, J. S.; Drucker, H.; Guyon, I.; Muller, U. A.; Sackinger, E.; Simard, P.; Vapnik, V. (August 1995).
corresponding kernel dimension. For example, a convolutional layer using 3x3 kernels would receive a 2-pixel pad, that is 1 pixel on each side of the image.
For many applications, little training data is available. Convolutional neural networks usually require a large amount of training data in order to avoid
Wieslander, Håkan; Harrison, Philip J.; Skogberg, Gabriel; Jackson, Sonya; Fridén, Markus; Karlsson, Johan; Spjuth, Ola; Wählby, Carolina (February 2021).
Unlike earlier reinforcement learning agents, DQNs that utilize CNNs can learn directly from high-dimensional sensory inputs via reinforcement learning.
of the output of any node is the same as in the training stages. This is the biggest contribution of the dropout method: although it effectively generates
Since all neurons in a single depth slice share the same parameters, the forward pass in each depth slice of the convolutional layer can be computed as a
Time-Series Anomaly Detection Service at Microsoft | Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
An alternate view of stochastic pooling is that it is equivalent to standard max pooling but with many copies of an input image, each having small local
algorithms have been proposed over the decades to train the weights of a neocognitron. Today, however, the CNN architecture is usually trained through
Deconvolution layers are used in image generators. By default, they create periodic checkerboard artifacts, which can be fixed by upscaling and then convolving.
Chen, Liang-Chieh; Papandreou, George; Schroff, Florian; Adam, Hartwig (2017-12-05). "Rethinking Atrous Convolution for Semantic Image Segmentation".
Springenberg, Jost Tobias; Dosovitskiy, Alexey; Brox, Thomas; Riedmiller, Martin (2014-12-21). "Striving for Simplicity: The All Convolutional Net".
: A library for convolutional neural networks. Created by the Berkeley Vision and Learning Center (BVLC). It supports both CPU and GPU. Developed in
Zafar, Afia; Aamir, Muhammad; Mohd Nawi, Nazri; Arshad, Ali; Riaz, Saman; Alruban, Abdulrahman; Dutta, Ashit Kumar; Almotairi, Sultan (2022-08-29).
Borovykh, Anastasia; Bohte, Sander; Oosterlee, Cornelis W. (2018-09-17). "Conditional Time Series Forecasting with Convolutional Neural Networks".
The accuracy of the final model is measured on a subset of the dataset set apart at the start, often called a test set. Other times, methods such as
The challenge is to find the right level of granularity so as to create abstractions at the proper scale, given a particular data set, and without
Huang, Jie; Zhou, Wengang; Zhang, Qilin; Li, Houqiang; Li, Weiping (2018). "Video-based Sign Language Recognition without Temporal Segmentation".
Thus, while also using a pyramidal structure as in the neocognitron, it performed a global optimization of the weights instead of a local one.
neuron in the fully-connected layer, 10,000 weights would be required for processing an image sized 100 × 100 pixels. However, applying cascaded
Yosinski, Jason; Clune, Jeff; Nguyen, Anh; Fuchs, Thomas; Lipson, Hod (2015-06-22). "Understanding Neural Networks Through Deep Visualization".
When applied to other types of data than image data, such as sound data, "spatial position" may variously correspond to different points in the
pooling operations were replaced with a stochastic procedure, where the activation within each pooling region is picked randomly according to a
Provided the eyes are not moving, the region of visual space within which visual stimuli affect the firing of a single neuron is known as its
Zhao, Bendong; Lu, Huanzhang; Chen, Shangfeng; Liu, Junliang; Wu, Dongya (2017-02-01). "Convolutional neural networks for time series classi".
The pose relative to the retina is the relationship between the coordinate frame of the retina and the intrinsic features' coordinate frame.
A worked example of performing a convolution. The convolution has stride 1, zero-padding, with kernel size 3-by-3. The convolution kernel is a
Chen, Yitian; Kang, Yanfei; Chen, Yixiong; Wang, Zizhuo (2019-06-11). "Probabilistic Forecasting with Temporal Convolutional Neural Network".
A major drawback to Dropout is that it does not have the same benefits for convolutional layers, where the neurons are not fully connected.
(also known as RoI pooling) is a variant of max pooling, in which the output size is fixed and the input rectangle is a parameter. They are used in
Bai, S.; Kolter, J.S.; Koltun, V. (2018). "An empirical evaluation of generic convolutional and recurrent networks for sequence modeling".
Proceedings of the 11th European Conference on Computer Vision: Part VI. ECCV'10. Berlin, Heidelberg: Springer-Verlag. pp. 140–153.
Ciresan, Dan; Meier, Ueli; Gambardella, Luca; Schmidhuber, Jürgen (2010). "Deep big simple neural nets for handwritten digit recognition".
has been adapted for use in convolutional layers by using sparse patches with a high-mask ratio and a global response normalization layer.
of the input (e.g., a particular shape). A distinguishing feature of CNNs is that many neurons can share the same filter. This reduces the
Maddison, Chris J.; Huang, Aja; Sutskever, Ilya; Silver, David (2014). "Move Evaluation in Go Using Deep Convolutional Neural Networks".
Furthermore, this technique allows convolutional network architectures to successfully be applied to problems with tiny training sets.
ReLU is often preferred to other functions because it trains the neural network several times faster without a significant penalty to
and in the overall network without affecting the receptive fields of the convolution layers. In 2011, Xavier Glorot, Antoine Bordes and
Grefenstette, Edward; Blunsom, Phil; de Freitas, Nando; Hermann, Karl Moritz (2014-04-29). "A Deep Architecture for Semantic Parsing".
(along width and height), but always extend along the entire depth of the input volume. Such an architecture ensures that the learned (
After passing through a convolutional layer, the image becomes abstracted to a feature map, also called an activation map, with shape:
IFIP Advances in Information and Communication Technology. Vol. 519. Cham: Springer International Publishing. pp. 97–108.
), which have a small receptive field, but extend through the full depth of the input volume. During the forward pass, each filter is
Ren, Hansheng; Xu, Bixiong; Wang, Yujing; Yi, Chao; Huang, Congrui; Kou, Xiaoyu; Xing, Tony; Yang, Mao; Tong, Jie; Zhang, Qi (2019).
DropConnect is the generalization of dropout in which each connection, rather than each output unit, can be dropped with probability
in 1980. It was inspired by the above-mentioned work of Hubel and Wiesel. The neocognitron introduced the two basic types of layers:
and image processing, and have only recently been replaced, in some cases, by newer deep learning architectures such as the
Another form of regularization is to enforce an absolute upper bound on the magnitude of the weight vector for every neuron and use
learned, thus lowering the memory requirements for running the network and allowing the training of larger, more powerful networks.
A couple of CNNs for choosing moves to try ("policy network") and evaluating positions ("value network") driving MCTS were used by
The term "convolution" first appears in neural networks in a paper by Toshiteru Homma, Les Atlas, and Robert Marks II at the first
A deconvolutional neural network is essentially the reverse of a CNN. It consists of deconvolutional layers and unpooling layers.
Lawrence, Steve; C. Lee Giles; Ah Chung Tsoi; Andrew D. Back (1997). "Face Recognition: A Convolutional Neural Network Approach".
Preliminary results were presented in 2014, with an accompanying paper in February 2015. The research described an application to
The technique seems to reduce node interactions, leading them to learn more robust features that better generalize to new data.
Sun, R.; Sessions, C. (June 2000). "Self-segmentation of sequences: automatic formation of hierarchies of sequential behaviors".
9378:. Communications in Computer and Information Science. Vol. 1342. Cham: Springer International Publishing. pp. 282–295. 8776: 7772: 7685:"Computerized detection of clustered microcalcifications in digital mammograms using a shift-invariant artificial neural network" 7602: 6244:. Communications in Computer and Information Science. Vol. 1342. Cham: Springer International Publishing. pp. 267–281. 5147:
published a paper showing that a CNN trained by supervised learning from a database of human professional games could outperform
Because a fully connected layer occupies most of the parameters, it is prone to overfitting. One method to reduce overfitting is
is usually 0.5; for input nodes, it is typically much higher because information is directly lost when input nodes are ignored.
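The dropout procedure described in this section can be sketched as follows. This is a minimal illustration of classic (non-inverted) dropout, assuming the parameter name `keep_prob` for the probability that a node is retained; many modern libraries instead rescale during training ("inverted" dropout), which is a different convention from the one shown here.

```python
import random


def dropout_forward(activations, keep_prob=0.5, training=True, rng=random):
    """Classic (non-inverted) dropout on a list of activations."""
    if training:
        # Each node is kept with probability keep_prob, otherwise zeroed.
        return [a if rng.random() < keep_prob else 0.0 for a in activations]
    # Test time: use every node, scaled by keep_prob so the expected
    # output matches what the next layer saw during training.
    return [a * keep_prob for a in activations]


rng = random.Random(0)  # fixed seed so the example is repeatable
hidden = [0.2, 1.5, 0.7, 3.0]
print(dropout_forward(hidden, training=True, rng=rng))
print(dropout_forward(hidden, training=False))  # [0.1, 0.75, 0.35, 1.5]
```

At test time a single pass through the full network stands in for averaging over the many thinned networks sampled during training.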
The kernel size is the number of pixels processed together. It is typically expressed as the kernel's dimensions, e.g., 2x2 or 3x3.
, in Document Analysis and Recognition (ICDAR), 2015 13th International Conference on, vol., no., pp.1021–1025, 23–26 Aug. 2015
Kalchbrenner, Nal; Grefenstette, Edward; Blunsom, Phil (2014-04-08). "A Convolutional Neural Network for Modelling Sentences".
A simple form of added regularizer is weight decay, which simply adds an additional error, proportional to the sum of weights (
The number of feature maps directly controls the capacity and depends on the number of available examples and task complexity.
Haotian, J.; Zhong, Li; Qianxiao, Li (2021). "Approximation Theory of Convolutional Architectures for Time Series Modelling".
Duta, Ionut Cosmin; Georgescu, Mariana Iuliana; Ionescu, Radu Tudor (2021-08-16). "Contextual Convolutional Neural Networks".
4464:, introduced in 2014. At each training stage, individual nodes are either "dropped out" of the net (ignored) with probability 82: 12735: 11425: 11294: 10540: 10486: 10459: 10218: 10113: 9880:
Rock, Irvin. "The frame of reference." The legacy of Solomon Asch: Essays in cognition and social psychology (1990): 243–268.
Zeiler, Matthew D.; Fergus, Rob (2013-01-15). "Stochastic Pooling for Regularization of Deep Convolutional Neural Networks".
9401: 9346: 9116: 8991: 8938: 8906: 8715: 8538: 8474: 8325: 7935: 7766: 7163: 6962: 6819: 6472:"Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position" 6267: 6103: 6036: 5986: 5647: 5606: 1464:
dilated convolutional layer can comprise filters with multiple dilation ratios, thus having a variable receptive field size.
, fully connected layers, and normalization layers. Here it should be noted how close a convolutional neural network is to a
Fully connected layers connect every neuron in one layer to every neuron in another layer. It is the same as a traditional
Ji, Shuiwang; Xu, Wei; Yang, Ming; Yu, Kai (2013-01-01). "3D Convolutional Neural Networks for Human Action Recognition".
Ciresan, Dan; Meier, Ueli; Schmidhuber, Jürgen (June 2012). "Multi-column deep neural networks for image classification".
of the neuron's weights with the input volume. Therefore, it is common to refer to the sets of weights as a filter (or a
Denker, J S, Gardner, W R, Graf, H. P, Henderson, D, Howard, R E, Hubbard, W, Jackel, L D, BaIrd, H S, and Guyon (1989)
7551:. The 42nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017). New Orleans, LA, US. 6471: 6304:"Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening" 3795:
found that ReLU enables better training of deeper networks, compared to widely used activation functions prior to 2011.
is rare. A greater stride means smaller overlap of receptive fields and smaller spatial dimensions of the output volume.
phoneme recognition system was invariant to both time and frequency shifts, as with images processed by a neocognitron.
is the first ANN which requires units located at multiple network positions to have shared weights, a hallmark of CNNs.
1456:. Typically the area is a square (e.g. 5 by 5 neurons). Whereas, in a fully connected layer, the receptive field is the 11015: 10912: 7822:"Error Back Propagation with Minimum-Entropy Weights: A Technique for Better Generalization of 2-D Shift-Invariant NNs" 1964:
channels has 3 million weights per fully-connected neuron, which is too high to feasibly process efficiently at scale.
Bogacz, Bartosz; Mara, Hubert (2020), "Period Classification of 3D Cuneiform Tablets with Geometric Neural Networks",
frames and use a group of neurons to represent a conjunction of the shape of the feature and its pose relative to the
Yin, W; Kann, K; Yu, M; Schütze, H (2017-03-02). "Comparative study of CNN and RNN for natural language processing".
8682: 8225: 7347: 6133: 5798: 5454: 5303: 4423: 4206: 4192: 4017: 3058: 2235: 2106: 1823:'s check reading systems, and fielded in several American banks since June 1996, reading millions of checks per day. 884: 879: 532: 115: 10492: 8009: 7890: 5074:
can identify potential treatments. In 2015, Atomwise introduced AtomNet, the first deep learning neural network for
Chellapilla, K; Fogel, DB (1999). "Evolving neural networks to play checkers without relying on expert knowledge".
6540:"Subject independent facial expression recognition with robust face detection using a convolutional neural network" 5127:) was tested on 165 games against players and ranked in the highest 0.4%. It also earned a win against the program 4289: 2629: 1641:
Hubel and Wiesel also proposed a cascading model of these two types of cells for use in pattern recognition tasks.
Chellapilla, K.; Fogel, D.B. (2001). "Evolving an expert checkers playing program without using human expertise".
Guide to convolutional neural networks : a practical application to traffic-sign detection and classification
However, we can find an approximation by using the full network with each node's output weighted by a factor of
The convolutional layer is the core building block of a CNN. The layer's parameters consist of a set of learnable
Simonyan, Karen; Zisserman, Andrew (2014). "Two-Stream Convolutional Networks for Action Recognition in Videos".
pattern between neurons of adjacent layers: each neuron is connected to only a small region of the input volume.
It was superior to other commercial courtesy amount reading systems (as of 1995). The system was integrated in
8893:. Lecture Notes in Computer Science. Vol. 8818. Cham: Springer International Publishing. pp. 364–375. 8513: 7821: 7684: 7623: 6169:"Parallel distributed processing model with local space-invariant interconnections and its optical architecture" 6168: 1109:
in earlier neural networks, are prevented by using regularized weights over fewer connections. For example, for
Presentation of the ICFHR paper on Period Classification of 3D Cuneiform Tablets with Geometric Neural Networks
Chervyakov, N.I.; Lyakhov, P.A.; Deryabin, M.A.; Nagornov, N.N.; Valueva, M.V.; Valuev, G.V. (September 2020).
neural nets, and as such allows for model combination, at test time only a single network needs to be tested.
convolved signals via de-convolution. This design was modified in 1989 to other de-convolution-based designs.
A different convolution-based design was proposed in 1988 for application to decomposition of one-dimensional
NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1
6012:"Deep Learning Techniques to Improve Intraoperative Awareness Detection from Electroencephalographic Signals" 5327: 4837: 4354: 4097: 3784: 1719: 912: 614: 390: 9854:
Hinton, Geoffrey (1979). "Some demonstrations of the effects of structural descriptions in mental imagery".
9626:"Deep Learning With Conformal Prediction for Hierarchical Analysis of Large-Scale Whole-Slide Tissue Images" 6672:
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence-Volume Volume Two
schemes for training spatio-temporal features have been introduced, based on Convolutional Gated Restricted
At testing time after training has finished, we would ideally like to find a sample average of all possible
is over 4 numbers. The depth dimension remains unchanged (this is true for other forms of pooling as well).
of that filter. As a result, the network learns filters that activate when it detects some specific type of
Convolutional neural networks are variants of multilayer perceptrons, designed to emulate the behavior of a
This independence from prior knowledge and human intervention in feature extraction is a major advantage.
The receptive fields of different neurons partially overlap such that they cover the entire visual field.
Mittelman, Roni (2015-08-03). "Time-series modeling with undecimated fully convolutional neural networks".
10587:"Learning Semantic Representations Using Convolutional Neural Networks for Web Search – Microsoft Research" 9776:"Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis – Microsoft Research" 8212:. ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning. pp. 873–880. 7480:
IEEE Transactions on Acoustics, Speech, and Signal Processing, Volume 37, No. 3, pp. 328. - 339 March 1989.
Although CNNs were invented in the 1980s, their breakthrough in the 2000s required fast implementations on
1160: 1070: 869: 806: 694: 537: 527: 9559: 7469: 7305:
Ramachandran, Prajit; Barret, Zoph; Quoc, V. Le (October 16, 2017). "Searching for Activation Functions".
Fukushima, K. (1969). "Visual feature extraction by a multilayered network of analog threshold elements".
2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)
5847:"An Artificial Neural Network for Spatio-Temporal Bipolar Patterns: Application to Phoneme Classification" 5637: 5596: 5261:, the most critical spatial regions/temporal instants could be visualized to justify the CNN predictions. 89: 12687: 11983: 11909: 5444: 5361: 5341: 5206: 5075: 5039: 3723: 2207: 1669: 1627:, whose output is maximized by straight edges having particular orientations within their receptive field 1304: 1197: 1154: 1074: 1020: 932: 917: 378: 200: 7454:. Meeting of the Institute of Electrical, Information and Communication Engineers (IEICE). Tokyo, Japan. 5969:
Collobert, Ronan; Weston, Jason (2008-01-01). "A unified architecture for natural language processing".
4107: 3246: 2013:" produce the strongest response to a spatially local input pattern. Stacking many such layers leads to 1443:
neural network (MLP). The flattened matrix goes through a fully connected layer to classify the images.
vdumoulin/conv_arithmetic: A technical report on convolution arithmetic in the context of deep learning
IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7–12, 2015
8889:. In Miao, Duoqian; Pedrycz, Witold; Ślȩzak, Dominik; Peters, Georg; Hu, Qinghua; Wang, Ruizhi (eds.). 5734:"Residue Number System-Based Solution for Reducing the Hardware Cost of a Convolutional Neural Network" 5416: 4790: 3783:. It effectively removes negative values from an activation map by setting them to zero. It introduces 3395: 2169: 2149: 2041:
to the features contained therein, allowing the CNN to be more robust to variations in their positions.
2010: 1481: 1355: 1252: 1102: 1082: 980: 907: 657: 552: 340: 273: 233: 11369: 9681:
Srivastava, Nitish; C. Geoffrey Hinton; Alex Krizhevsky; Ilya Sutskever; Ruslan Salakhutdinov (2014).
9319:
Ribeiro, Antonio H.; Schön, Thomas B. (2021). "How Convolutional Neural Networks Deal with Aliasing".
6749:. First International Conference on Spoken Language Processing (ICSLP 90). Kobe, Japan. Archived from 5018:
units are typically incorporated after the CNN to account for inter-frame or inter-clip dependencies.
exactly preserve the spatial size of the input volume, this is commonly referred to as "same" padding.
(MLP) models were used for image recognition. However, the full connectivity between nodes caused the
Yu, Fisher; Koltun, Vladlen (2016-04-30). "Multi-Scale Context Aggregation by Dilated Convolutions".
Ko, Tom; Peddinti, Vijayaditya; Povey, Daniel; Seltzer, Michael L.; Khudanpur, Sanjeev (March 2018).
7043:"Comparing Object Recognition in Humans and Deep Convolutional Neural Networks—An Eye Tracking Study" 6851:
Yu, Fisher; Koltun, Vladlen (2016-04-30). "Multi-Scale Context Aggregation by Dilated Convolutions".
5459: 5258: 5128: 5047: 4965: 4945: 4461: 3989: 3976: 3851: 2583: 2485:, then the strides are incorrect and the neurons cannot be tiled to fit across the input volume in a 2262: 2153: 2141: 1731: 1324: 1034: 640: 408: 278: 11707: 11031:
Clark, Christopher; Storkey, Amos (2014). "Teaching Deep Convolutional Neural Networks to Play Go".
A unified architecture for natural language processing: Deep neural networks with multitask learning
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
6802: 3470: 2336: 12740: 12611: 11868: 10360:"Segment-Tube: Spatio-Temporal Action Localization in Untrimmed Videos with Per-Frame Segmentation" 9956: 9371: 8090: 7871:
Decomposition of surface EMG signals into single fiber action potentials by means of neural network
7392: 6237: 5412: 5404: 5254: 5015: 4695: 4296: 3804: 2219: 2179: 2026:
under shifts of the locations of input features in the visual field, i.e. they grant translational
1869: 1809: 662: 582: 505: 423: 253: 215: 210: 170: 165: 11627: 7543: 5923:
Convolutional neural networks are a promising tool for solving the problem of pattern recognition.
5676:"Powder-Bed Fusion Process Monitoring by Machine Vision With Hybrid Convolutional Neural Networks" 5674:
Zhang, Yingjie; Soon, Hong Geok; Ye, Dongsen; Fuh, Jerry Ying Hsi; Zhu, Kunpeng (September 2020).
Hyperparameters are various settings that are used to control the learning process. CNNs use more
, and was computationally intractable with higher-resolution images. A 1000×1000-pixel image with
Chollet, François (2017-04-04). "Xception: Deep Learning with Depthwise Separable Convolutions".
5733: 5388:: The reference deep-learning library for Python with an API largely compatible with the popular 5274:
A deep Q-network (DQN) is a type of deep learning model that combines a deep neural network with
5250: 5162: 5152: 5001:
further 20 million images without faces. They used batches of 128 images over 50,000 iterations.
4691: 4394: 4163: 3220: 2077: 2002: 1957: 1341: 1331: 1300: 1190:
responses known as feature maps. Counter-intuitively, most convolutional neural networks are not
1148: 609: 458: 358: 185: 42: 11562:
Mnih, Volodymyr; et al. (2015). "Human-level control through deep reinforcement learning".
Proceedings of the 17th International Conference on Frontiers of Handwriting Recognition (ICFHR)
Proceedings of the 27th International Conference on International Conference on Machine Learning
van Dyck, Leonard Elia; Kwitt, Roland; Denzler, Sebastian Jochen; Gruber, Walter Roland (2021).
and Chellapilla published papers showing how a convolutional neural network could learn to play
is a spatial convolution applied independently over each channel of the input tensor, while the
12616: 12376: 12095: 12090: 11702: 11622: 11457:
Wang, Le; Zang, Jinliang; Zhang, Qilin; Niu, Zhenxing; Hua, Gang; Zheng, Nanning (2018-06-21).
10518: 10243: 10196: 9951: 9682: 8085: 8076:
Hinton, GE; Osindero, S; Teh, YW (Jul 2006). "A fast learning algorithm for deep belief nets".
is a specific form of average pooling, where the entire channel is averaged. That is, it maps
Proceedings of the 15th International Conference on Document Analysis and Recognition (ICDAR)
Wang, Le; Duan, Xuhuan; Zhang, Qilin; Niu, Zhenxing; Hua, Gang; Zheng, Nanning (2018-05-22).
Rectifier and softplus activation functions. The second one is a smooth version of the first.
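The relationship between the two curves can be checked numerically. The following is a minimal Python sketch, assuming the usual definitions ReLU(x) = max(0, x) and softplus(x) = ln(1 + e^x); the function names `relu` and `softplus` are illustrative and are not taken from any particular library.

```python
import math


def relu(x: float) -> float:
    """Rectifier: passes positive inputs through and clamps negatives to 0."""
    return max(0.0, x)


def softplus(x: float) -> float:
    """Smooth approximation of the rectifier, ln(1 + e^x).

    Written as max(x, 0) + log1p(exp(-|x|)), which is algebraically the same
    but avoids overflow for large |x|.
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Far from zero the two functions nearly coincide; near zero softplus
# rounds off the corner of the rectifier.
for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  softplus={softplus(x):.4f}")
```

Softplus stays strictly above the rectifier everywhere, with the largest gap (ln 2) at x = 0.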
the input image into a set of rectangles and, for each such sub-region, outputs the maximum.
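As an illustration of that partitioning, here is a small pure-Python sketch of max pooling with a 2×2 window and stride 2 on a single channel. The helper name `max_pool_2x2` and the sample values are made up for this example, and the sketch assumes even height and width rather than defining any edge handling.

```python
def max_pool_2x2(channel):
    """2x2 max pooling with stride 2 on one channel (a list of equal-length rows).

    Assumption: height and width are even; real layers also specify what
    happens at ragged borders.
    """
    height, width = len(channel), len(channel[0])
    if height % 2 or width % 2:
        raise ValueError("this sketch expects even height and width")
    return [
        [
            # Maximum over one non-overlapping 2x2 rectangle.
            max(channel[r][c], channel[r][c + 1],
                channel[r + 1][c], channel[r + 1][c + 1])
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[6, 8], [3, 4]]
```

Sixteen activations go in and four come out, so three quarters of the values are discarded while the strongest response in each rectangle is kept.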
1983: 1953: 1692: 1440: 1364: 1286: 1255:(or kernels) through automated learning, whereas in traditional algorithms these filters are 1205: 1201: 1191: 743: 565: 517: 373: 288: 160: 12514: 11259:(in German), heiDATA – institutional repository for research data of Heidelberg University, 2022:
response field. Replicating units in this way allows for the resulting activation map to be
Gruber, N. (2021). "Detecting dynamics of action in text with a recurrent neural network".
RoI pooling to size 2×2. In this example, the region proposal (an input parameter) has size 7×5.
Ciresan, Dan; Ueli Meier; Jonathan Masci; Luca M. Gambardella; Jurgen Schmidhuber (2011).
Convolutional Neural Networks Demystified: A Matched Filtering Perspective Based Tutorial
are on the order of 3–4. Some papers report improvements when using this form of regularization.
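A hedged sketch of how such a bound might be enforced, assuming the regularizer in question is a max-norm constraint that keeps each neuron's weight vector within an L2 ball of radius c. The helper `clip_max_norm` is hypothetical, and projecting onto the ball gives a norm of at most c rather than strictly below it.

```python
import math


def clip_max_norm(weights, c):
    """Rescale one neuron's weight vector so its L2 norm does not exceed c.

    Vectors already inside the ball are returned unchanged (as a copy).
    """
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= c:
        return list(weights)
    scale = c / norm
    return [w * scale for w in weights]


# Norm 5.0 is above the bound, so the vector is scaled by 0.5.
print(clip_max_norm([3.0, 4.0], c=2.5))  # [1.5, 2.0]
# Norm 0.5 is already within the bound, so nothing changes.
print(clip_max_norm([0.3, 0.4], c=3.0))  # [0.3, 0.4]
```

In training, a step like this would typically run after each parameter update, which is why large learning rates cannot push the weights arbitrarily far.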
is typically used, often with a 2x2 dimension. This implies that the input is drastically
8: 12241: 12219: 11968: 11963: 11921: 11873: 11789: 11749: 11459:"Action Recognition by an Attention-Aware Temporal Weighted Convolutional Neural Network" 11257:
HeiCuBeDa Hilprecht – Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection
9680: 8663:"Why do deep convolutional networks generalize so poorly to small image transformations?" 7281: 6743:
Yamaguchi, Kouichi; Sakamoto, Kenji; Akabane, Toshio; Fujimoto, Yoshiji (November 1990).
Ong, Hao Yi; Chavez, Kevin; Hong, Augustus (2015-08-18). "Distributed Deep Q-Learning".
11477: 10378: 9575: 8766:"Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition" 7986: 7870: 7700: 7639: 7598:
Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel,
to translations of the input. However, layers with a stride greater than one ignore the
3498: 12626: 12204: 11730: 11595: 11542: 11504: 11439: 11403: 11300: 11232: 11184: 11163: 11142: 11121: 11100: 11053: 11032: 10863: 10842: 10823: 10805: 10780: 10757: 10720: 10699: 10678: 10637: 10616: 10565: 10546: 10405: 10337: 10296: 10277: 10141: 10091: 10027: 9977: 9805: 9751: 9663: 9595: 9534: 9407: 9379: 9352: 9324: 9180: 9094: 9041: 9008: 8959: 8867: 8822: 8801: 8600: 8582: 8544: 8480: 8452: 8408: 8299: 8273: 8231: 8166: 8111: 7941: 7413: 7353: 7306: 7285: 7230: 7129: 7104: 7077: 7042: 6983: 6894: 6873: 6852: 6833: 6787: 6638: 6502: 6444: 6411: 6338: 6303: 6273: 6245: 6109: 6050: 5992: 5914: 5816: 5753: 5703: 5099: 4885: 4706: 4586: 4566: 4516: 4493: 4093: 3799: 3727: 3708: 3690: 3298: 2637: 2408: 2388: 2368: 2348: 2239: 1654: 1142: 1130: 753: 677: 463: 258: 10060: 9867: 8765: 7745: 7599: 7191: 6558: 5042:. CNN models are effective for various NLP problems and achieved excellent results in 4063: 3798:
Other functions can also be used to increase nonlinearity, for example the saturating
12692: 12680: 12484: 12136: 12007: 12000: 11720: 11648: 11640: 11599: 11587: 11509: 11491: 11431: 11421: 11304: 11290: 11011: 10958: 10827: 10761: 10636:
Kim, Yoon (2014-08-25). "Convolutional Neural Networks for Sentence Classification".
10536: 10482: 10455: 10410: 10392: 10269: 10261: 10214: 10136:; Khosla, Aditya; Bernstein, Michael; Berg, Alexander C.; Fei-Fei, Li (2014). "Image 10132:; Deng, Jia; Su, Hao; Krause, Jonathan; Satheesh, Sanjeev; Ma, Sean; Huang, Zhiheng; 10109: 10031: 10019: 9969: 9667: 9655: 9647: 9599: 9587: 9555: 9522: 9443: 9411: 9397: 9356: 9342: 9338: 9155: 9112: 9067: 9059: 8987: 8986:. Cambridge New York Port Melbourne New Delhi Singapore: Cambridge University Press. 8934: 8926: 8902: 8859: 8711: 8674: 8534: 8484: 8470: 8412: 8400: 8291: 8221: 8103: 7931: 7762: 7747:
Learning algorithms for classification: A comparison on handwritten digit recognition
7712: 7651: 7405: 7343: 7222: 7159: 7134: 7082: 7064: 7023: 6958: 6825: 6815: 6562: 6506: 6494: 6449: 6431: 6343: 6277: 6263: 6196: 6099: 6054: 6042: 6032: 5982: 5918: 5906: 5804: 5794: 5757: 5707: 5695: 5643: 5602: 5574: 5566: 5385: 5226: 5165:
program Fuego simulating ten thousand playouts (about a million positions) per move.
5023: 4941: 4726: 4699: 4690:
Even before Dropout, in 2013 a technique called stochastic pooling, the conventional
4675:. Each unit thus receives input from a random subset of units in the previous layer. 4440: 4338: 3788: 1299:
of the convolution kernel with the layer's input matrix. This product is usually the
1225: 846: 689: 602: 398: 368: 313: 308: 263: 205: 11734: 11350:"CNN based common approach to handwritten character recognition of multiple scripts" 10585:
Mesnil, Gregoire; Deng, Li; Gao, Jianfeng; He, Xiaodong; Shen, Yelong (April 2014).
the neurons in each depth slice are constrained to use the same weights and bias.
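To give a sense of scale for this constraint, the sketch below counts the parameters of one convolutional layer with and without sharing. The layer shape (a 55×55×96 output produced by 11×11 filters over 3 input channels) is an illustrative assumption, and both function names are hypothetical.

```python
def conv_params_shared(num_filters, kernel, in_channels):
    """One set of kernel weights plus one bias per depth slice (filter)."""
    return num_filters * (kernel * kernel * in_channels + 1)


def conv_params_unshared(out_h, out_w, num_filters, kernel, in_channels):
    """Every output neuron owns its own kernel weights and bias."""
    return out_h * out_w * num_filters * (kernel * kernel * in_channels + 1)


shared = conv_params_shared(num_filters=96, kernel=11, in_channels=3)
unshared = conv_params_unshared(55, 55, num_filters=96, kernel=11, in_channels=3)

print(shared)              # 34944
print(unshared)            # 105705600
print(unshared // shared)  # 3025, i.e. 55 * 55 spatial positions
```

Under these assumptions sharing reduces the layer from roughly 106 million parameters to about 35 thousand, a factor equal to the number of spatial positions in each depth slice.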
dimension may activate in the presence of various oriented edges, or blobs of color.
12437: 12427: 12234: 12028: 11978: 11973: 11916: 11904: 11712: 11632: 11579: 11499: 11481: 11443: 11413: 11282: 11260: 11213: 10985: 10950: 10815: 10749: 10550: 10528: 10447: 10446:. 25th IEEE International Conference on Image Processing (ICIP). pp. 918–922. 10400: 10382: 10281: 10253: 10206: 10129: 10101: 10011: 9981: 9961: 9915: 9863: 9637: 9579: 9389: 9334: 9218: 9145: 9104: 9051: 8979: 8894: 8851: 8732: 8592: 8530: 8526: 8462: 8392: 8303: 8283: 8213: 8115: 8095: 8017: 7990: 7945: 7923: 7802: 7754: 7704: 7643: 7491: 7417: 7397: 7357: 7335: 7260: 7234: 7214: 7124: 7120: 7116: 7072: 7054: 7015: 6950: 6837: 6807: 6665:"Flexible, High Performance Convolutional Neural Networks for Image Classification" 6554: 6486: 6439: 6427: 6423: 6387: 6333: 6323: 6255: 6188: 6113: 6091: 6028: 6024: 5996: 5974: 5898: 5745: 5687: 5558: 5514: 5488: 5319: 5246: 5043: 4050: 4042: 3906: 3308: 3119: 2645: 2265:
receptive fields between the columns, and to large output volumes. For any integer
2215: 2014: 1987: 1884: 1840: 1485: 1267: 874: 627: 577: 487: 471: 441: 303: 298: 248: 238: 136: 10667:."Proceedings of the 25th international conference on Machine learning. ACM, 2008. 8317: 8235: 6662: 6010:
Avilov, Oleksii; Rimbert, Sebastien; Popov, Anton; Bougrain, Laurent (July 2020).
uses the maximum value of each local cluster of neurons in the feature map, while
12550: 12494: 12316: 11958: 11878: 11785: 11356: 10664: 10323: 10133: 9897: 9890: 9825: 9722:"Regularization of Neural Networks using DropConnect | ICML 2013 | JMLR W&CP" 9583: 9393: 9108: 8898: 8619: 7994: 7914: 7901: 7877: 7857: 7606: 7583: 7528: 7476: 6704: 6664: 6328: 6259: 6134:"Shift-invariant pattern recognition neural network and its optical architecture" 5902: 5749: 5238: 4321: 4001: 3210:{\displaystyle {\frac {\sum _{i}e^{\beta a_{i}}a_{i}}{\sum _{i}e^{\beta a_{i}}}}} 2211: 2127:
Neurons of a convolutional layer (blue), connected to their receptive field (red)
2046: 2027: 2023: 1961: 1921:
A very deep CNN with over 100 layers by Microsoft won the ImageNet 2015 contest.
1774: 1740: 1696: 1634: 1610: 1581:
CNNs are often compared to the way the brain achieves vision processing in living
1374: 1349: 1241: 1233: 1187: 1124: 1106: 1094: 902: 706: 572: 512: 11417: 10210: 9997:"A Convolutional Neural Network Approach for Objective Video Quality Assessment" 8165:
Ranzato, MarcAurelio; Poultney, Christopher; Chopra, Sumit; LeCun, Yann (2007).
8099: 7545:
A Study on Data Augmentation of Reverberant Speech for Robust Speech Recognition
7401: 7003: 12524: 12489: 12479: 12304: 12062: 11888: 11795: 11349: 11003: 10753: 9941: 9086: 9085:
Zhai, Xiaohua; Kolesnikov, Alexander; Houlsby, Neil; Beyer, Lucas (June 2022).
9055: 9030:"Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition" 9029: 8047: 8014:
12th International Conference on Document Analysis and Recognition (ICDAR 2005)
7880:". Proc. IEEE International Symp. on Circuits and Systems, pp. 1008–1011, 1989. 6942: 6011: 5971:
Proceedings of the 25th international conference on Machine learning - ICML '08
5881: 5675: 5398: 5337: 5115: 5067: 4972:
in an objective way after manual training; the resulting system had a very low
4949: 4746: 4714: 4604: 3109: 2777: 2200: 2050: 1934: 1598: 1594: 1340:(number of inputs) × (feature map height) × (feature map width) × (feature map 1312: 1213:
characterize a given dataset rather than the biases of a poorly-populated set.
922: 453: 190: 11331: 11231:
Petneházi, Gábor (2019-08-21). "QCNN: Quantile Convolutional Neural Network".
10532: 10475:
Taylor, Graham W.; Fergus, Rob; LeCun, Yann; Bregler, Christoph (2010-01-01).
10451: 10105: 9994: 9721: 9642: 9625: 9526: 9461:
Jaderberg, Max; Simonyan, Karen; Zisserman, Andrew; Kavukcuoglu, Koray (2015).
9447: 8596: 7790: 7327: 7059: 6954: 6811: 6615: 6537: 6392: 6367: 5378:-licensed Theano-like library with support for CPU, GPU, Google's proprietary 1637:, whose output is insensitive to the exact position of the edges in the field. 12729: 12469: 12449: 12366: 12045: 11644: 11495: 11435: 11286: 11217: 10396: 10265: 9651: 9591: 9239: 9207:"Appropriate number and allocation of ReLUs in convolutional neural networks" 9159: 9063: 8863: 8855: 8678: 8404: 7894: 7339: 7264: 7068: 7027: 6829: 6435: 5910: 5808: 5699: 5691: 5570: 5439: 5087: 4969: 4924: 4053: 4035: 4030:
penalizes the deviation between the predicted output of the network, and the
4023: 4013: 3792: 2607: 2345:
The spatial size of the output volume is a function of the input volume size
control the size of the output volume of the convolutional layer: the depth,
2226:) filters produce the strongest response to a spatially local input pattern. 2034: 1994: 1851: 1677: 1602: 1308: 1229: 1090: 1086: 841: 770: 652: 383: 268: 11716: 10819: 10015: 8217: 7532:, Advances in Neural Information Processing Systems, 1990, Morgan Kaufmann. 5978: 5639:
Recent Trends and Advances in Artificial Intelligence and Internet of Things
5538: 2628:
Another important concept of CNNs is pooling, which is a form of non-linear
Following the advances in the training of 1-D CNNs by Waibel et al. (1987),
12555: 12386: 11801: 11699:
Proceedings of the 26th Annual International Conference on Machine Learning
11652: 11591: 11513: 11264: 10962: 10517:. CVPR '11. Washington, DC, US: IEEE Computer Society. pp. 3361–3368. 10414: 10326:." IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2014. 10273: 10023: 9973: 9659: 9437: 9071: 8956:
Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
8295: 8210:
Proceedings of the 26th Annual International Conference on Machine Learning
8107: 8021: 7655: 7624:"Image processing of human corneal endothelium based on a learning network" 7409: 7226: 7138: 7086: 7019: 6943:"Adaptive deconvolutional networks for mid and high level feature learning" 6566: 6347: 6200: 6046: 5636:
Balas, Valentina E.; Kumar, Raghvendra; Srivastava, Rajshree (2019-11-19).
5578: 5449: 5349: 5144: 4563:
dropped-out networks; unfortunately this is unfeasible for large values of
4031: 2940:. If all activations are non-negative, then average pooling is the case of 2772:
In addition to max pooling, pooling units can use other functions, such as
2586:), which is convolved with the input. The result of this convolution is an 2425:
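on the border. The number of neurons that "fit" in a given volume is then $\frac{W-K+2P}{S}+1$, where $W$ is the input volume size, $K$ the kernel field size, $S$ the stride, and $P$ the amount of zero padding.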
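The count can be checked with a short calculation. This is a minimal sketch assuming the usual formula (W − K + 2P)/S + 1, with W the input size, K the kernel field size, P the zero padding, and S the stride along one spatial dimension; the function name `conv_output_size` is illustrative.

```python
def conv_output_size(w, k, p, s):
    """Number of kernel positions along one dimension: (W - K + 2P) / S + 1.

    Raises ValueError when the kernel does not tile the padded input evenly,
    i.e. when the result would not be a whole number of neurons.
    """
    span = w - k + 2 * p
    if span < 0 or span % s != 0:
        raise ValueError(f"W={w}, K={k}, P={p}, S={s} do not fit evenly")
    return span // s + 1


# A 227-wide input with an 11-wide kernel, no padding, stride 4.
print(conv_output_size(227, 11, 0, 4))  # 55
# Padding of 1 with a 3-wide kernel and stride 1 preserves the input size.
print(conv_output_size(32, 3, 1, 1))    # 32
```

A 224-wide input with the same 11-wide kernel and stride 4 raises the error, since (224 − 11)/4 is not an integer. Frameworks usually resolve such cases by flooring or by adding padding, a choice this sketch leaves out.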
kernels or filters that slide along input features and provide translation-
10257: 9460: 9091:
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
8850:(4). Institute of Electrical and Electronics Engineers (IEEE): 3224–3233. 8662: 7743: 7716: 6498: 6453: 2045:
Together, these properties allow CNNs to achieve better generalization on
1753: 12651: 12422: 12331: 12326: 11948: 11926: 11252: 11077: 10476: 8466: 8287: 8167:"Efficient Learning of Sparse Representations with an Energy-Based Model" 8129:
Bengio, Yoshua; Lamblin, Pascal; Popovici, Dan; Larochelle, Hugo (2007).
7647: 7464: 6192: 5484: 5434: 5222: 5190: 5140: 5095: 5091: 5079: 4444: 4285: 4274: 4267: 4101: 2649: 2579: 2161: 2157: 1937: 1833: 1758:
Denker et al. (1989) designed a 2-D CNN system to recognize hand-written
1735: 1624: 1296: 1209: 1183: 1167: 647: 141: 11636: 11615:
IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics
11583: 10989: 9237: 9150: 9133: 8048:"High Performance Convolutional Neural Networks for Document Processing" 7218: 6608:"Convolutional Neural Networks (LeNet) – DeepLearning 0.1 documentation" 6138:
Proceedings of Annual Conference of the Japan Society of Applied Physics
6095: 5562: 5411:
framework with wide support for machine learning algorithms, written in
providing almost 2000 normalized 2-D and 3-D datasets prepared with the
12545: 12504: 12499: 12412: 12321: 12229: 12141: 12121: 10061:"ImageNet Large Scale Visual Recognition Competition 2014 (ILSVRC2014)" 9911:"The inside story of how AI got good enough to dominate Silicon Valley" 9516: 8045: 6538:
Matsugu, Masakazu; Mori, Katsuhiko; Mitari, Yusuke; Kaneda, Yuji (2003).
6490: 6412:"Receptive fields and functional architecture of monkey striate cortex" 5371: 5286: 5275: 5198: 4408: in this section. Unsourced material may be challenged and removed. 4177: in this section. Unsourced material may be challenged and removed. 2790: 2091: in this section. Unsourced material may be challenged and removed. 1805: 1766: 1221: 796: 492: 418: 11486: 10954: 10387: 9965: 8514:"The Potential of the Intel (R) Xeon Phi for Supervised Deep Learning" 8449:
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
7328:"Learning recognition and segmentation of 3-D objects from 2-D images" 1620:
Their 1968 paper identified two basic visual cell types in the brain:
$\mapsto \begin{bmatrix} x & x \\ x & x \end{bmatrix}$
Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E. (2017-05-24).
8007: 7893:." In Proc. 28th IEEE Decision and Control Conf., pp. 266–271, 1989. 7860:". In Proc. 27th IEEE Decision and Control Conf., pp. 343–347, 1988. 7806: 7708: 6595: 5210: 5194: 5124: 4989: 4985: 4762: 4300: 2907:{\displaystyle \left({\frac {1}{N}}\sum _{i}|a_{i}|^{p}\right)^{1/p}} 2632:. There are several non-linear functions to implement pooling, where 1820: 1582: 955: 736: 11391: 10563: 9995:
Le Callet, Patrick; Christian Viard-Gaudin; Dominique Barba (2006).
9278: 9028:
He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian (2015-09-01).
9007:
Lin, Min; Chen, Qiang; Yan, Shuicheng (2013). "Network In Network".
Artificial Neural Networks (ICANN), 20th International Conference on
Tenth International Workshop on Frontiers in Handwriting Recognition
Daniel Graupe, Boris Vern, G. Gruener, Aaron Field, and Qiu Huang. "
Proceedings of the International Joint Conference on Neural Networks
Zeiler, Matthew D.; Taylor, Graham W.; Fergus, Rob (November 2011).
and amount of computation in the network, and hence to also control
between the filter entries and the input, producing a 2-dimensional
3D volumes of neurons. The layers of a CNN have neurons arranged in
31: 12442: 12274: 11408: 11237: 11189: 11168: 11147: 11126: 11105: 10868: 10847: 10810: 10785: 10725: 10704: 10317:
Large-scale video classification with convolutional neural networks
10301: 9826:"Dropout: A Simple Way to Prevent Neural Networks from Overfitting" 9803: 9683:"Dropout: A Simple Way to Prevent Neural Networks from overfitting" 9384: 9329: 9185: 9174: 9134:"A Comparison of Pooling Methods for Convolutional Neural Networks" 9099: 8978:
Zhang, Aston; Lipton, Zachary; Li, Mu; Smola, Alexander J. (2024).
8587: 8457: 7927: 7311: 7290: 6988: 6899: 6878: 6857: 6643: 6250: 5111: 4329: 4027: 2486: 1976: 1787: 1759: 1750:
TDNNs improved the performance of far-distance speech recognition.
11058: 11037: 10884:"Toronto startup has a faster way to discover effective medicines" 10683: 10642: 10621: 10570: 10444:
2018 25th IEEE International Conference on Image Processing (ICIP)
10342: 10146: 10096: 9810: 9756: 9370:
Myburgh, Johannes C.; Mouton, Coenraad; Davel, Marelie H. (2020).
9046: 9013: 8964: 8827: 8806: 8708:
Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow
8278: 8263: 8203:"Large-scale deep unsupervised learning using graphics processors" 6792: 6746:
A Neural Network for Speaker-Independent Isolated Word Recognition
6236:
Mouton, Coenraad; Myburgh, Johannes C.; Davel, Marelie H. (2020).
6084: 5598:
Convolutional Neural Networks in Visual Computing: A Concise Guide
5026:
and Independent Subspace Analysis. Its application can be seen in
1605:
contain neurons that individually respond to small regions of the
1200:
are usually fully connected networks, that is, each neuron in one
12565: 12402: 12356: 12279: 12179: 12174: 12126: 9774:
Platt, John; Steinkraus, Dave; Simard, Patrice Y. (August 2003).
9623: 9281:"Imagenet classification with deep convolutional neural networks" 8927:"A theoretical analysis of feature pooling in visual recognition" 8620:"ImageNet Classification with Deep Convolutional Neural Networks" 8378:"ImageNet classification with deep convolutional neural networks" 7973:
Oh, KS; Jung, K (2004). "GPU implementation of neural networks".
7758: 7522:
Connectionist Architectures for Multi-Speaker Phoneme Recognition
7155:
Brain and visual perception: the story of a 25-year collaboration
7105:"Receptive fields of single neurones in the cat's striate cortex" 6709:"ImageNet Classification with Deep Convolutional Neural Networks" 5731: 5169: 5071: 4954: 4778: 4774: 4713:
of the input images, which delivers excellent performance on the
4263: 2822: 2773: 2766: 2482: 1915: 1907: 1276: 731: 11362: 10162:"The Face Detection Algorithm Set To Revolutionize Image Search" 9515:
Sabour, Sara; Frosst, Nicholas; Hinton, Geoffrey E. (2017-10-26).
7789:
Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. (November 1998).
6742: 6023:. Vol. 2020. Montreal, QC, Canada: IEEE. pp. 142–145. 1208:. The "full connectivity" of these networks makes them prone to 12580: 12560: 12432: 12224: 10614: 8128: 7922:. Lecture Notes in Computer Science. Vol. 2766. Springer. 6780:
2012 IEEE Conference on Computer Vision and Pattern Recognition
5331: 5148: 4916: 4439:
is a process of introducing additional information to solve an
2160:
across the width and height of the input volume, computing the
1644: 482: 11276: 10236:
IEEE Transactions on Pattern Analysis and Machine Intelligence
9034:
IEEE Transactions on Pattern Analysis and Machine Intelligence
8440:
He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian (2016).
7002:
Odena, Augustus; Dumoulin, Vincent; Olah, Chris (2016-10-17).
5289:
gaming. Other deep reinforcement learning models preceded it.
5094:
for multiple disease targets, most notably treatments for the
5070:. Predicting the interaction between molecules and biological 4337:
before downsampling operations, spatial transformer networks,
3408:
to pooling. Specifically, it takes as input a list of vectors
2620: 2190: 1967: 12381: 12361: 12351: 12346: 12341: 12336: 12299: 12131: 10435: 8733:"CS231n Convolutional Neural Networks for Visual Recognition" 8348:"History of computer vision contests won by deep CNNs on GPU" 5934: 5844: 5389: 5323: 5241:. However, human interpretable explanations are required for 2123: 1899: 1880: 1799: 1330:(number of inputs) × (input height) × (input width) × (input 1272: 1194:, due to the downsampling operation they apply to the input. 726: 721: 448: 11782:
CS231n: Convolutional Neural Networks for Visual Recognition
11781: 11051: 9889:
J. Hinton, Coursera lectures on Neural Networks, 2012, Url:
8885:
Yu, Dingjun; Wang, Hanli; Chen, Peiqiu; Wei, Zhihua (2014).
7851:
Applications of neural networks to medical signal processing
7492:"Convolutional networks for images, speech, and time series" 7284:(2022). "Annotated History of Modern AI and Deep Learning". 5887: 5537:
LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey (2015-05-28).
5090:. Subsequently, AtomNet was used to predict novel candidate 1944:-level parallelism that is available on the Intel Xeon Phi. 1910:, a similar GPU-based CNN by Alex Krizhevsky et al. won the 1473:
consists of iteratively adjusting these biases and weights.
1247:
CNNs use relatively little pre-processing compared to other
12371: 11450: 11377: 10676: 10351: 8764:
Scherer, Dominik; Müller, Andreas C.; Behnke, Sven (2010).
7600:
Backpropagation Applied to Handwritten Zip Code Recognition
5394: 5355: 1941: 1808:
et al. in 1995, classifies hand-written numbers on checks (
1291:
A convolutional neural network consists of an input layer,
10190: 9173:
Gholamalinezhad, Hossein; Khosravi, Hossein (2020-09-16),
9084: 8164: 7577:
Neural network recognizer for hand-written zip code digits
7040: 6009: 5237:
End-to-end training and prediction are common practice in
$$f_{X,Y}(S) = \max_{a,b=0}^{1} S_{2X+a,\,2Y+b}.$$
Graham, Benjamin (2014-12-18). "Fractional Max-Pooling".
8701: 8699: 8572: 8375: 7791:"Gradient-based learning applied to document recognition" 6871: 1930: 1876: 10861: 10474: 7788: 6533: 6531: 4927:
imposes coordinate frames in order to represent shapes.
4315: 3394:
applies max pooling (or any other form of pooling) in a
1588: 1014:
List of datasets in computer vision and image processing
10128: 9172: 8925:
Boureau, Y-Lan; Ponce, Jean; LeCun, Yann (2010-06-21).
7889:
Qiu Huang, Daniel Graupe, Yi Fang Huang, Ruey Wen Liu."
6777: 6088:
2017 IEEE 19th Conference on Business Informatics (CBI)
5201:, benchmark datasets are becoming available, including 5172:, the first to beat the best human player at the time. 4303:. Often, non-overlapping pooling windows perform best. 4045:
loss function is used for predicting a single class of
1804:
LeNet-5, a pioneering 7-level convolutional network by
1754:
Image recognition with CNNs trained by gradient descent
11750:"Google Built Its Very Own Chips to Power Its AI Bots" 11140: 9211:
Research Bulletin of NTUU "Kyiv Polytechnic Institute"
8696: 7541: 7332:
1993 (4th) International Conference on Computer Vision
7304: 7190:
LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey (2015).
3914: 3739: 3680:{\displaystyle \mathrm {MultiheadedAttention} (Q,V,V)} 2495: 2308: 2271: 1879:. In 2005, another paper also emphasised the value of 1529: 1236:
respond to stimuli only in a restricted region of the
11534: 11325: 10085: 9773: 9279:
Krizhevsky, A.; Sutskever, I.; Hinton, G. E. (2012).
8759: 8757: 7916:
Hierarchical Neural Networks for Image Interpretation
7891:
Identification of firing patterns of neuronal signals
6528: 5845:
Homma, Toshiteru; Les Atlas; Robert Marks II (1987).
4888: 4840: 4811: 4655: 4613: 4589: 4569: 4542: 4519: 4496: 4470: 4110: 4066: 3854: 3807: 3693: 3593: 3501: 3473: 3414: 3364: 3323: 3275: 3249: 3223: 3128: 3061: 3034: 3002: 2972: 2946: 2920: 2831: 2662: 2541: 2434: 2411: 2391: 2371: 2351: 1511: 1394: 1251:. This means that the network learns to optimize the 11696: 11400:
Artificial Intelligence Applications and Innovations
11348:
Durjoy Sen Maitra; Ujjwal Bhattacharya; S.K. Parui,
11245: 10840: 10774: 8046:
Kumar Chellapilla; Sid Puri; Patrice Simard (2006).
7498:(Second ed.). The MIT press. pp. 276–278. 7470:
Phoneme Recognition Using Time-Delay Neural Networks
7448:
Phoneme Recognition Using Time-Delay Neural Networks
7253:
IEEE Transactions on Systems Science and Cybernetics
6892: 5635: 5184: 2049:. Weight sharing dramatically reduces the number of 1359:
1D convolutional neural network feed forward example
11281:(in German), Sydney, Australien, pp. 148–153, 9439:
Making Convolutional Networks Shift-Invariant Again
8775:. Thessaloniki, Greece: Springer. pp. 92–101. 8763: 6980:
A guide to convolution arithmetic for deep learning
6080: 6078: 5232: 3707:is a matrix of trainable parameters. It is used in 2815:is a linear sum of maxpooling and average pooling. 1716:
Conference on Neural Information Processing Systems
1388:is a standard convolution restricted to the use of 1224:processes in that the connectivity pattern between 56:. Unsourced material may be challenged and removed. 11792:computer science course on CNNs in computer vision 10478:Convolutional Learning of Spatio-temporal Features 9369: 8977: 8754: 8201:Raina, R; Madhavan, A; Ng, Andrew (14 June 2009). 8008:Dave Steinkraus; Patrice Simard; Ian Buck (2005). 6978:Dumoulin, Vincent; Visin, Francesco (2018-01-11), 6940: 6738: 6736: 6405: 6403: 6235: 4894: 4874: 4826: 4667: 4626: 4595: 4575: 4555: 4525: 4502: 4482: 4131: 4084: 4034:data labels (during supervised learning). Various 3967: 3897: 3840: 3775: 3699: 3679: 3579: 3487: 3459: 3379: 3350: 3287: 3261: 3235: 3209: 3100: 3047: 3014: 2984: 2958: 2932: 2906: 2757: 2553: 2527: 2470: 2417: 2397: 2377: 2357: 2320: 2286: 1562: 1406: 10441: 10335: 9630:IEEE Journal of Biomedical and Health Informatics 9470:Advances in Neural Information Processing Systems 9288:Advances in Neural Information Processing Systems 9176:Pooling Methods in Deep Neural Networks, a Review 8933:. ICML'10. Madison, WI, USA: Omnipress: 111–118. 8924: 8887:"Mixed Pooling for Convolutional Neural Networks" 8174:Advances in Neural Information Processing Systems 8138:Advances in Neural Information Processing Systems 8075: 7369: 7367: 7189: 7001: 6784:Institute of Electrical and Electronics Engineers 6361: 6359: 6357: 5854:Advances in Neural Information Processing Systems 5536: 4981:ImageNet Large Scale Visual Recognition Challenge 4959:ImageNet Large Scale Visual Recognition Challenge 4905: 1924: 1912:ImageNet Large Scale Visual Recognition Challenge 1826: 1182:, based on the shared-weight architecture of the 12727: 11667:"Convolutional Deep Belief Networks on CIFAR-10" 11456: 11397: 11182: 10975: 10940: 10913:"Startup Harnesses Supercomputers to Seek Cures" 10697: 10357: 10294: 7849:Daniel Graupe, Ruey Wen Liu, George S Moschytz." 
7496:The handbook of brain theory and neural networks 7246: 7244: 6947:2011 International Conference on Computer Vision 6075: 4679:are chosen at random during the training stage. 3755: 3351:{\displaystyle \mathbb {R} ^{H\times W\times C}} 2692: 11098: 10718: 10584: 8200: 6733: 6400: 5968: 3992:. Their activations can thus be computed as an 3101:{\displaystyle {\frac {a_{i}}{\sum _{j}a_{j}}}} 2616:Worked example of 2x2 maxpooling with stride 2. 2385:of the convolutional layer neurons, the stride 11206:Journal of Systems Engineering and Electronics 9988: 9908: 9891:https://www.coursera.org/learn/neural-networks 8954:Zeiler, Matthew D.; Fergus, Rob (2013-01-15), 8841: 8442:"Deep Residual Learning for Image Recognition" 7364: 7325: 6354: 5944:. Curran Associates, Inc. pp. 2643–2651. 5673: 5033: 4299:of the signal, and may result in unacceptable 1914:2012. It was an early catalytic event for the 1009:List of datasets for machine-learning research 11817: 11540: 11203: 10978:IEEE Transactions on Evolutionary Computation 9027: 8884: 8439: 8131:"Greedy Layer-Wise Training of Deep Networks" 7513: 7440: 7438: 7241: 7152:David H. Hubel and Torsten N. Wiesel (2005). 6977: 6090:. Thessaloniki, Greece: IEEE. pp. 7–12. 4209:than a standard multilayer perceptron (MLP). 4038:can be used, depending on the specific task. 1846: 1725: 1476:The vectors of weights and biases are called 1042: 11831: 11030: 10777:International Conference on Machine Learning 10512: 9749: 9539:: CS1 maint: multiple names: authors list ( 9500:: CS1 maint: multiple names: authors list ( 9318: 8953: 8511: 8010:"Using GPUs for Machine Learning Algorithms" 7458: 7276: 7274: 6600: 5595:Venkatesan, Ragav; Li, Baoxin (2017-10-23). 5594: 4857: 4841: 4447:. CNNs use various types of regularization. 4360:are applied. Other strategies include using 2624:Max pooling with a 2x2 filter and stride = 2 2489:way. In general, setting zero padding to be 2341:wraps around to the other side of the image. 
Convolutional neural network

A convolutional neural network (CNN) is a type of feed-forward neural network that learns features through filter (or kernel) optimization. Vanishing and exploding gradients, seen during backpropagation in earlier neural networks, are mitigated by regularized weights over fewer connections. In a fully connected feedforward neural network, each neuron is connected to all neurons in the next layer; a neuron in a convolutional layer is connected only to a local region of its input, its receptive field. This connectivity pattern resembles the organization of the animal visual cortex. CNNs are also known as shift invariant or space invariant artificial neural networks.

Some applications of CNNs include image and video recognition, recommender systems, image classification, image segmentation, medical image analysis, natural language processing, and brain–computer interfaces.

History

Receptive fields in the visual cortex
Work by Hubel and Wiesel in the 1950s and 1960s showed that cat visual cortices contain neurons that individually respond to small regions of the visual field.

Neocognitron, origin of the CNN architecture
The neocognitron was introduced by Kunihiko Fukushima. In 1969, Fukushima had introduced the ReLU (rectified linear unit) activation function. Max-pooling is often used in modern CNNs.

Convolution in time
The time delay neural network (TDNN) was introduced in 1987 by Alex Waibel et al.

Later developments include the shift-invariant neural network, which was applied to mammograms (1994), the neural abstraction pyramid, implementations on graphics processing units, and Intel Xeon Phi implementations.

Building blocks

Convolutional layer
Each filter is defined by a set of parameters and responds to a feature at some spatial position in the input.

Local connectivity
Each neuron is connected only to a small region of the input (sparse local connectivity). The extent of this connectivity is a hyperparameter called the receptive field of the neuron.

Spatial arrangement
Three hyperparameters control the size of the output volume: the depth, the stride, and the padding size. A stride S > 0 means that the filter is translated S units at a time per output. In practice, S ≥ 3 is rare.

The spatial size of the output volume is a function of the input volume size W, the kernel field size K, the stride S, and the amount of zero padding P. The number of neurons that fit along one dimension is

(W − K + 2P)/S + 1.

If this number is not an integer, the strides are incorrect and the neurons cannot be tiled symmetrically across the input volume. Setting the zero padding to P = (K − 1)/2 when the stride is S = 1 ensures that the input and output volumes have the same spatial size.

Parameter sharing
Sharing one filter across all spatial positions contributes to the translation invariance of the CNN architecture.

Pooling layer
Max pooling is the most common form of pooling. Pooling reduces the spatial size of the representation and, with it, the number of parameters and the memory footprint. Variants include:

Lp-norm pooling is like average pooling, but uses the Lp-norm average instead of the average, where p ≥ 1. Average pooling is the case of p = 1, and maxpooling is the case of p → ∞. Square-root pooling is the case of p = 2.
Stochastic pooling samples a random activation a_i with probability a_i / Σ_j a_j. It is the same as average pooling in expectation.
Softmax pooling is like maxpooling, but uses a softmax with parameter β > 0. Average pooling is the case of β ↓ 0, and maxpooling is the case of β ↑ ∞.
Channel max pooling, global average pooling (GAP), spatial pyramidal pooling, and region of interest pooling (used in object detection) are further variants.
Multihead attention pooling applies a multiheaded attention block to pooling. It takes a list of vectors v_1, v_2, …, v_n, applies a feedforward layer FFN to each vector to form the matrix V = [FFN(v_1), …, FFN(v_n)], then sends the resulting matrix to MultiheadedAttention(Q, V, V), where Q is a matrix of trainable parameters. It is used in vision transformers.

ReLU layer
ReLU is the abbreviation of rectified linear unit, introduced by Kunihiko Fukushima in 1969. ReLU applies the non-saturating activation function f(x) = max(0, x), which increases the nonlinearity of the decision function. Other functions can also be used to increase nonlinearity, for example the saturating hyperbolic tangent f(x) = tanh(x), f(x) = |tanh(x)|, and the sigmoid function σ(x) = (1 + e^(−x))^(−1).

Fully connected layer
As in regular artificial neural networks, the activations can be computed as an affine transformation: a matrix multiplication followed by a bias offset (vector addition of a learned or fixed bias term).

Loss layer
The "loss layer", or "loss function", specifies how training penalizes the deviation between the predicted output and the true labels. Softmax loss is used for predicting a single class of K mutually exclusive classes. Sigmoid cross-entropy loss is used for predicting K independent probability values in [0, 1]. Euclidean loss is used for regressing to real-valued labels in (−∞, ∞).

Hyperparameters

Pooling type and size
Pooling reduces the size of the feature maps, reducing processing cost. Greater pooling reduces the dimension of the signal and may result in unacceptable information loss.

Translation equivariance and aliasing
Layers with a stride larger than one ignore the Nyquist-Shannon sampling theorem and might lead to aliasing of the input signal. Proposed remedies include data augmentation, subsampling combined with pooling, and capsule neural networks.

Regularization methods
Regularization introduces additional information to solve an ill-posed problem or to prevent overfitting.

Dropout
At each training stage, individual nodes are either dropped with probability 1 − p or kept with probability p. For a network with n such nodes, this effectively samples from 2^n thinned networks.

Stochastic pooling
Stochastic pooling can be viewed as standard max pooling applied to many copies of an input image, each with small local deformations. This is similar to explicit elastic deformations of the input.

Artificial data
Data augmentation enlarges the training set with perturbed copies of existing examples.

Weight decay
Penalties on the magnitude (L1) or squared magnitude (L2) of the weight vector can be combined, which is called elastic net regularization.

Max norm constraints
Another form of regularization enforces an absolute upper bound on the magnitude of the weight vector of every neuron and uses projected gradient descent to enforce the constraint. In practice, the weight vector w of every neuron is clamped to satisfy ‖w‖₂ < c.

Applications
CNNs are often used in image recognition systems. CNNs have also been explored for natural language processing, including semantic parsing, and have been used in structure-based drug design. Other application areas include anomaly detection, time series forecasting, and cultural heritage and 3D datasets.
2134: 2113: 2102: 2096: 2093: 2082: 2070: 2059: 2057:Building blocks 2051:free parameters 2047:vision problems 1988:spatially local 1950: 1927: 1866: 1849: 1829: 1810:British English 1802: 1796: 1783: 1775:computer vision 1756: 1741:backpropagation 1728: 1712: 1697:backpropagation 1647: 1611:receptive field 1603:visual cortices 1591: 1579: 1553: 1552: 1547: 1541: 1540: 1535: 1525: 1524: 1510: 1507: 1506: 1494: 1492:Deconvolutional 1470: 1454:receptive field 1449: 1447:Receptive field 1437: 1429:average pooling 1420: 1393: 1390: 1389: 1375:backpropagation 1350:receptive field 1321: 1289: 1280: 1265: 1257:hand-engineered 1242:receptive field 1176:shift invariant 1107:backpropagation 1095:computer vision 1055: 1026: 1025: 999: 991: 990: 951: 943: 942: 903:Kernel machines 898: 890: 889: 865: 857: 856: 837:Active learning 832: 824: 823: 792: 782: 781: 707:Diffusion model 643: 633: 632: 605: 595: 594: 568: 558: 557: 513:Factor analysis 508: 498: 497: 481: 444: 434: 433: 354: 353: 337: 336: 335: 324: 323: 229: 221: 220: 186:Online learning 151: 139: 122: 111: 105: 102: 59: 57: 47: 35: 24: 17: 12: 11: 5: 12759: 12749: 12748: 12743: 12738: 12721: 12720: 12718: 12717: 12716: 12715: 12710: 12697: 12696: 12695: 12690: 12676: 12673: 12672: 12670: 12669: 12664: 12659: 12654: 12649: 12644: 12639: 12634: 12629: 12624: 12619: 12614: 12609: 12604: 12599: 12593: 12591: 12587: 12586: 12584: 12583: 12578: 12573: 12568: 12563: 12558: 12553: 12548: 12543: 12537: 12535: 12531: 12530: 12528: 12527: 12525:Ilya Sutskever 12522: 12517: 12512: 12507: 12502: 12497: 12492: 12490:Demis Hassabis 12487: 12482: 12480:Ian Goodfellow 12477: 12472: 12466: 12464: 12460: 12459: 12456: 12455: 12453: 12452: 12447: 12446: 12445: 12435: 12430: 12425: 12420: 12415: 12410: 12405: 12399: 12397: 12393: 12392: 12390: 12389: 12384: 12379: 12374: 12369: 12364: 12359: 12354: 12349: 12344: 12339: 12334: 12329: 12324: 12319: 12314: 12309: 12308: 12307: 12297: 12292: 12287: 12282: 12277: 12271: 12269: 12265: 12264: 
12262: 12261: 12256: 12255: 12254: 12249: 12239: 12238: 12237: 12232: 12227: 12217: 12212: 12207: 12202: 12197: 12192: 12187: 12182: 12177: 12171: 12169: 12162: 12158: 12157: 12155: 12154: 12149: 12144: 12139: 12134: 12129: 12124: 12118: 12116: 12112: 12111: 12109: 12108: 12103: 12098: 12093: 12088: 12082: 12080: 12076: 12075: 12073: 12072: 12071: 12070: 12063:Language model 12060: 12055: 12050: 12049: 12048: 12038: 12037: 12036: 12025: 12023: 12019: 12018: 12016: 12015: 12013:Autoregression 12010: 12005: 12004: 12003: 11993: 11991:Regularization 11988: 11987: 11986: 11981: 11976: 11966: 11961: 11956: 11954:Loss functions 11951: 11946: 11941: 11936: 11931: 11930: 11929: 11919: 11914: 11913: 11912: 11901: 11899: 11895: 11894: 11892: 11891: 11889:Inductive bias 11886: 11881: 11876: 11871: 11866: 11861: 11856: 11851: 11843: 11841: 11835: 11834: 11829: 11828: 11821: 11814: 11806: 11800: 11799: 11793: 11777: 11776:External links 11774: 11772: 11771: 11740: 11725: 11689: 11658: 11621:(3): 403–418. 11605: 11554: 11533: 11449: 11426: 11390: 11361: 11341: 11324: 11309: 11295: 11269: 11255:(2019-06-07), 11244: 11223: 11212:(1): 162–169. 11196: 11175: 11154: 11133: 11112: 11091: 11065: 11044: 11023: 11017:978-1558607835 11016: 10995: 10984:(4): 422–428. 10968: 10949:(6): 1382–91. 10933: 10919:. 2015-05-27. 10904: 10875: 10854: 10833: 10792: 10767: 10732: 10711: 10690: 10669: 10649: 10628: 10607: 10577: 10556: 10541: 10505: 10487: 10467: 10460: 10434: 10350: 10328: 10308: 10287: 10242:(1): 221–231. 10226: 10219: 10183: 10153: 10121: 10114: 10078: 10052: 9987: 9957:10.1.1.92.5813 9934: 9901: 9882: 9873: 9862:(3): 231–250. 9846: 9817: 9796: 9763: 9742: 9713: 9673: 9636:(2): 371–380. 9616: 9558:(2019-06-01). 9546: 9507: 9453: 9428: 9402: 9362: 9347: 9311: 9271: 9230: 9197: 9165: 9124: 9117: 9077: 9020: 8999: 8992: 8970: 8946: 8939: 8914: 8907: 8877: 8834: 8813: 8792: 8753: 8724: 8716: 8695: 8644: 8610: 8581:(1): 197–227. 
8565: 8539: 8504: 8475: 8432: 8365: 8338: 8309: 8256: 8226: 8193: 8157: 8121: 8091:10.1.1.76.1541 8084:(7): 1527–54. 8068: 8038: 8000: 7965: 7936: 7928:10.1007/b11963 7905: 7882: 7862: 7842: 7812: 7781: 7767: 7733: 7672: 7634:(29): 4211–7. 7628:Applied Optics 7611: 7588: 7568: 7534: 7512: 7482: 7457: 7434: 7393:10.1.1.76.1541 7363: 7348: 7318: 7297: 7270: 7259:(4): 322–333. 7240: 7182: 7164: 7144: 7092: 7033: 6994: 6970: 6963: 6933: 6906: 6885: 6864: 6843: 6820: 6763: 6732: 6696: 6650: 6629: 6599: 6587: 6553:(5): 555–559. 6527: 6485:(4): 193–202. 6459: 6422:(1): 215–243. 6399: 6368:"Neocognitron" 6353: 6294: 6268: 6217: 6179:(32): 4790–7. 6173:Applied Optics 6154: 6119: 6104: 6074: 6037: 6002: 5987: 5961: 5927: 5880: 5832: 5799: 5778: 5738:Neurocomputing 5724: 5666: 5648: 5625: 5607: 5584: 5528: 5526: 5523: 5520: 5519: 5506: 5497: 5475: 5474: 5472: 5469: 5468: 5467: 5462: 5457: 5452: 5447: 5442: 5437: 5432: 5425: 5422: 5421: 5420: 5402: 5383: 5369: 5359: 5353: 5338:Deeplearning4j 5335: 5315: 5312: 5297:Main article: 5294: 5291: 5271: 5268: 5266: 5263: 5234: 5231: 5218: 5215: 5186: 5183: 5177: 5174: 5136: 5133: 5107: 5104: 5068:drug discovery 5063: 5062:Drug discovery 5060: 5055: 5052: 5035: 5032: 5006: 5005:Video analysis 5003: 4950:MNIST database 4937: 4934: 4932: 4929: 4907: 4904: 4891: 4871: 4868: 4863: 4859: 4852: 4849: 4843: 4820: 4817: 4798: 4795: 4770: 4767: 4757: 4754: 4747:Early stopping 4745:Main article: 4742: 4741:Early stopping 4739: 4737: 4734: 4725:Main article: 4722: 4719: 4715:MNIST data set 4684: 4681: 4664: 4661: 4658: 4646: 4643: 4621: 4617: 4605:expected value 4592: 4572: 4550: 4546: 4522: 4499: 4479: 4476: 4473: 4457: 4454: 4452: 4449: 4443:or to prevent 4437:Regularization 4432: 4431: 4390: 4388: 4381: 4372:Main article: 4369: 4366: 4350: 4347: 4317: 4314: 4308: 4305: 4282: 4279: 4259: 4256: 4246: 4239: 4236: 4231: 4228: 4222: 4219: 4214: 4211: 4201: 4200: 4159: 4157: 4150: 4144: 4141: 4128: 4125: 4122: 4119: 4116: 4113: 4081: 4078: 4075: 
4072: 4069: 4036:loss functions 4009: 4006: 3984: 3981: 3977:generalization 3962: 3959: 3955: 3949: 3946: 3942: 3938: 3935: 3932: 3929: 3926: 3923: 3920: 3917: 3893: 3889: 3886: 3883: 3880: 3877: 3873: 3869: 3866: 3863: 3860: 3857: 3837: 3834: 3831: 3828: 3825: 3822: 3819: 3816: 3813: 3810: 3772: 3769: 3766: 3763: 3760: 3757: 3754: 3751: 3748: 3745: 3742: 3726:introduced by 3719: 3716: 3696: 3676: 3673: 3670: 3667: 3664: 3661: 3658: 3654: 3651: 3648: 3645: 3642: 3639: 3636: 3633: 3630: 3627: 3624: 3621: 3618: 3615: 3612: 3609: 3606: 3603: 3600: 3597: 3576: 3573: 3568: 3564: 3560: 3556: 3553: 3550: 3546: 3543: 3540: 3537: 3532: 3528: 3524: 3520: 3517: 3514: 3510: 3507: 3504: 3483: 3480: 3477: 3454: 3450: 3446: 3443: 3440: 3435: 3431: 3427: 3422: 3418: 3374: 3369: 3345: 3342: 3339: 3336: 3333: 3328: 3284: 3281: 3278: 3258: 3255: 3252: 3232: 3229: 3226: 3199: 3195: 3191: 3187: 3181: 3177: 3169: 3165: 3157: 3153: 3149: 3145: 3139: 3135: 3110:in expectation 3092: 3088: 3082: 3078: 3071: 3067: 3042: 3038: 3011: 3008: 3005: 2981: 2978: 2975: 2955: 2952: 2949: 2929: 2926: 2923: 2901: 2897: 2893: 2888: 2882: 2877: 2870: 2866: 2861: 2855: 2851: 2845: 2842: 2836: 2809: 2808:Other poolings 2806: 2800: 2797: 2779: 2754: 2749: 2746: 2743: 2740: 2737: 2734: 2731: 2728: 2725: 2721: 2715: 2710: 2707: 2704: 2701: 2698: 2694: 2690: 2687: 2684: 2681: 2676: 2673: 2670: 2666: 2606:Main article: 2603: 2600: 2588:activation map 2567: 2564: 2550: 2547: 2544: 2524: 2520: 2516: 2513: 2510: 2507: 2504: 2501: 2498: 2479: 2478: 2467: 2464: 2459: 2455: 2452: 2449: 2446: 2443: 2440: 2414: 2394: 2374: 2354: 2334: 2333: 2329: 2317: 2314: 2311: 2283: 2280: 2277: 2274: 2254: 2231: 2228: 2216:local in space 2208:hyperparameter 2187: 2184: 2166:activation map 2133: 2130: 2115: 2114: 2073: 2071: 2064: 2058: 2055: 2043: 2042: 2035:pooling layers 2031: 2019: 2006: 1949: 1946: 1935:Intel Xeon Phi 1926: 1923: 1906:Subsequently, 1865: 1862: 1848: 1845: 1828: 1825: 1798:Main article: 1795: 1792: 1782: 1779: 
1755: 1752: 1727: 1724: 1711: 1708: 1666: 1665: 1662: 1646: 1643: 1639: 1638: 1628: 1590: 1587: 1578: 1575: 1557: 1551: 1548: 1546: 1543: 1542: 1539: 1536: 1534: 1531: 1530: 1528: 1523: 1520: 1517: 1514: 1493: 1490: 1469: 1466: 1448: 1445: 1436: 1433: 1419: 1418:Pooling layers 1416: 1403: 1400: 1397: 1320: 1317: 1313:matched filter 1309:pooling layers 1285:Main article: 1264: 1261: 1172: 1171: 1164: 1158: 1152: 1146: 1140: 1134: 1128: 1081:by itself via 1057: 1056: 1054: 1053: 1046: 1039: 1031: 1028: 1027: 1024: 1023: 1018: 1017: 1016: 1006: 1000: 997: 996: 993: 992: 989: 988: 983: 978: 973: 968: 963: 958: 952: 949: 948: 945: 944: 941: 940: 935: 930: 925: 923:Occam learning 920: 915: 910: 905: 899: 896: 895: 892: 891: 888: 887: 882: 880:Learning curve 877: 872: 866: 863: 862: 859: 858: 855: 854: 849: 844: 839: 833: 830: 829: 826: 825: 822: 821: 820: 819: 809: 804: 799: 793: 788: 787: 784: 783: 780: 779: 773: 768: 763: 758: 757: 756: 746: 741: 740: 739: 734: 729: 724: 714: 709: 704: 699: 698: 697: 687: 686: 685: 680: 675: 670: 660: 655: 650: 644: 639: 638: 635: 634: 631: 630: 625: 620: 612: 606: 601: 600: 597: 596: 593: 592: 591: 590: 585: 580: 569: 564: 563: 560: 559: 556: 555: 550: 545: 540: 535: 530: 525: 520: 515: 509: 504: 503: 500: 499: 496: 495: 490: 485: 479: 474: 469: 461: 456: 451: 445: 440: 439: 436: 435: 432: 431: 426: 421: 416: 411: 406: 401: 396: 388: 387: 386: 381: 376: 366: 364:Decision trees 361: 355: 341:classification 331: 330: 329: 326: 325: 322: 321: 316: 311: 306: 301: 296: 291: 286: 281: 276: 271: 266: 261: 256: 251: 246: 241: 236: 234:Classification 230: 227: 226: 223: 222: 219: 218: 213: 208: 203: 198: 193: 191:Batch learning 188: 183: 178: 173: 168: 163: 158: 152: 149: 148: 145: 144: 133: 132: 124: 123: 38: 36: 29: 15: 9: 6: 4: 3: 2: 12758: 12747: 12744: 12742: 12739: 12737: 12734: 12733: 12731: 12714: 12711: 12709: 12706: 12705: 12698: 12694: 12691: 12689: 12686: 12685: 12682: 12678: 12677: 12674: 12668: 12665: 12663: 12660: 12658: 12655: 
12653: 12650: 12648: 12645: 12643: 12640: 12638: 12635: 12633: 12630: 12628: 12625: 12623: 12620: 12618: 12615: 12613: 12610: 12608: 12605: 12603: 12600: 12598: 12595: 12594: 12592: 12590:Architectures 12588: 12582: 12579: 12577: 12574: 12572: 12569: 12567: 12564: 12562: 12559: 12557: 12554: 12552: 12549: 12547: 12544: 12542: 12539: 12538: 12536: 12534:Organizations 12532: 12526: 12523: 12521: 12518: 12516: 12513: 12511: 12508: 12506: 12503: 12501: 12498: 12496: 12493: 12491: 12488: 12486: 12483: 12481: 12478: 12476: 12473: 12471: 12470:Yoshua Bengio 12468: 12467: 12465: 12461: 12451: 12450:Robot control 12448: 12444: 12441: 12440: 12439: 12436: 12434: 12431: 12429: 12426: 12424: 12421: 12419: 12416: 12414: 12411: 12409: 12406: 12404: 12401: 12400: 12398: 12394: 12388: 12385: 12383: 12380: 12378: 12375: 12373: 12370: 12368: 12367:Chinchilla AI 12365: 12363: 12360: 12358: 12355: 12353: 12350: 12348: 12345: 12343: 12340: 12338: 12335: 12333: 12330: 12328: 12325: 12323: 12320: 12318: 12315: 12313: 12310: 12306: 12303: 12302: 12301: 12298: 12296: 12293: 12291: 12288: 12286: 12283: 12281: 12278: 12276: 12273: 12272: 12270: 12266: 12260: 12257: 12253: 12250: 12248: 12245: 12244: 12243: 12240: 12236: 12233: 12231: 12228: 12226: 12223: 12222: 12221: 12218: 12216: 12213: 12211: 12208: 12206: 12203: 12201: 12198: 12196: 12193: 12191: 12188: 12186: 12183: 12181: 12178: 12176: 12173: 12172: 12170: 12166: 12163: 12159: 12153: 12150: 12148: 12145: 12143: 12140: 12138: 12135: 12133: 12130: 12128: 12125: 12123: 12120: 12119: 12117: 12113: 12107: 12104: 12102: 12099: 12097: 12094: 12092: 12089: 12087: 12084: 12083: 12081: 12077: 12069: 12066: 12065: 12064: 12061: 12059: 12056: 12054: 12051: 12047: 12046:Deep learning 12044: 12043: 12042: 12039: 12035: 12032: 12031: 12030: 12027: 12026: 12024: 12020: 12014: 12011: 12009: 12006: 12002: 11999: 11998: 11997: 11994: 11992: 11989: 11985: 11982: 11980: 11977: 11975: 11972: 11971: 11970: 11967: 11965: 11962: 11960: 11957: 11955: 11952: 
11950: 11947: 11945: 11942: 11940: 11937: 11935: 11934:Hallucination 11932: 11928: 11925: 11924: 11923: 11920: 11918: 11915: 11911: 11908: 11907: 11906: 11903: 11902: 11900: 11896: 11890: 11887: 11885: 11882: 11880: 11877: 11875: 11872: 11870: 11867: 11865: 11862: 11860: 11857: 11855: 11852: 11850: 11849: 11845: 11844: 11842: 11840: 11836: 11827: 11822: 11820: 11815: 11813: 11808: 11807: 11804: 11797: 11794: 11791: 11787: 11783: 11780: 11779: 11759: 11755: 11751: 11744: 11736: 11732: 11728: 11726:9781605585161 11722: 11718: 11714: 11709: 11704: 11700: 11693: 11675: 11668: 11662: 11654: 11650: 11646: 11642: 11638: 11634: 11629: 11628:10.1.1.11.226 11624: 11620: 11616: 11609: 11601: 11597: 11593: 11589: 11585: 11581: 11577: 11573: 11569: 11565: 11558: 11549: 11544: 11537: 11519: 11515: 11511: 11506: 11501: 11497: 11493: 11488: 11483: 11479: 11475: 11471: 11467: 11460: 11453: 11445: 11441: 11437: 11433: 11429: 11423: 11419: 11415: 11410: 11405: 11401: 11394: 11380:on 2019-09-07 11379: 11375: 11371: 11365: 11358: 11354: 11351: 11345: 11339: 11335: 11328: 11320: 11313: 11306: 11302: 11298: 11292: 11288: 11284: 11280: 11273: 11266: 11262: 11258: 11254: 11248: 11239: 11234: 11227: 11219: 11215: 11211: 11207: 11200: 11191: 11186: 11179: 11170: 11165: 11158: 11149: 11144: 11137: 11128: 11123: 11116: 11107: 11102: 11095: 11079: 11075: 11069: 11060: 11055: 11048: 11039: 11034: 11027: 11019: 11013: 11009: 11005: 10999: 10991: 10987: 10983: 10979: 10972: 10964: 10960: 10956: 10952: 10948: 10944: 10937: 10922: 10918: 10914: 10908: 10893: 10889: 10885: 10879: 10870: 10865: 10858: 10849: 10844: 10837: 10829: 10825: 10821: 10817: 10812: 10807: 10803: 10796: 10787: 10782: 10778: 10771: 10763: 10759: 10755: 10751: 10747: 10743: 10736: 10727: 10722: 10715: 10706: 10701: 10694: 10685: 10680: 10673: 10666: 10662: 10659: 10653: 10644: 10639: 10632: 10623: 10618: 10611: 10596: 10592: 10588: 10581: 10572: 10567: 10560: 10552: 10548: 10544: 10538: 10534: 10530: 10525: 10520: 10516: 10509: 
10494: 10490: 10484: 10480: 10479: 10471: 10463: 10457: 10453: 10449: 10445: 10438: 10420: 10416: 10412: 10407: 10402: 10398: 10394: 10389: 10384: 10380: 10376: 10372: 10368: 10361: 10354: 10344: 10339: 10332: 10325: 10321: 10318: 10312: 10303: 10298: 10291: 10283: 10279: 10275: 10271: 10267: 10263: 10259: 10255: 10250: 10245: 10241: 10237: 10230: 10222: 10216: 10212: 10208: 10203: 10198: 10194: 10187: 10171: 10167: 10163: 10157: 10148: 10143: 10139: 10135: 10131: 10125: 10117: 10111: 10107: 10103: 10098: 10093: 10089: 10082: 10066: 10062: 10056: 10037: 10033: 10029: 10025: 10021: 10017: 10013: 10009: 10005: 9998: 9991: 9983: 9979: 9975: 9971: 9967: 9963: 9958: 9953: 9950:(1): 98–113. 9949: 9945: 9938: 9922: 9918: 9917: 9912: 9905: 9899: 9895: 9892: 9886: 9877: 9869: 9865: 9861: 9857: 9850: 9835: 9831: 9827: 9821: 9812: 9807: 9800: 9785: 9781: 9777: 9770: 9768: 9758: 9753: 9746: 9731: 9727: 9723: 9717: 9699: 9695: 9691: 9684: 9677: 9669: 9665: 9661: 9657: 9653: 9649: 9644: 9639: 9635: 9631: 9627: 9620: 9605: 9601: 9597: 9593: 9589: 9585: 9581: 9577: 9573: 9569: 9565: 9561: 9557: 9550: 9542: 9536: 9528: 9524: 9520: 9519: 9511: 9503: 9497: 9479: 9475: 9471: 9464: 9457: 9449: 9445: 9441: 9440: 9432: 9417: 9413: 9409: 9405: 9399: 9395: 9391: 9386: 9381: 9377: 9373: 9366: 9358: 9354: 9350: 9344: 9340: 9336: 9331: 9326: 9322: 9315: 9297: 9294:: 1097–1105. 
9293: 9289: 9282: 9275: 9268: 9258:on 2016-12-13 9254: 9247: 9246: 9241: 9240:Yoshua Bengio 9234: 9225: 9220: 9216: 9212: 9208: 9201: 9187: 9182: 9178: 9177: 9169: 9161: 9157: 9152: 9147: 9143: 9139: 9135: 9128: 9120: 9114: 9110: 9106: 9101: 9096: 9092: 9088: 9081: 9073: 9069: 9065: 9061: 9057: 9053: 9048: 9043: 9039: 9035: 9031: 9024: 9015: 9010: 9003: 8995: 8989: 8985: 8981: 8974: 8966: 8961: 8957: 8950: 8942: 8936: 8932: 8928: 8921: 8919: 8910: 8904: 8900: 8896: 8892: 8888: 8881: 8873: 8869: 8865: 8861: 8857: 8853: 8849: 8845: 8838: 8829: 8824: 8817: 8808: 8803: 8796: 8778: 8774: 8767: 8760: 8758: 8742: 8738: 8734: 8728: 8719: 8713: 8709: 8702: 8700: 8684: 8680: 8676: 8673:(184): 1–25. 8672: 8668: 8664: 8657: 8655: 8653: 8651: 8649: 8633: 8630:: 1097–1105. 8629: 8625: 8621: 8614: 8606: 8602: 8598: 8594: 8589: 8584: 8580: 8576: 8569: 8554: 8550: 8546: 8542: 8536: 8532: 8528: 8524: 8520: 8515: 8508: 8490: 8486: 8482: 8478: 8472: 8468: 8464: 8459: 8454: 8450: 8443: 8436: 8418: 8414: 8410: 8406: 8402: 8398: 8394: 8390: 8386: 8379: 8372: 8370: 8353: 8349: 8342: 8327: 8323: 8319: 8313: 8305: 8301: 8297: 8293: 8289: 8285: 8280: 8275: 8271: 8267: 8260: 8241: 8237: 8233: 8229: 8227:9781605585161 8223: 8219: 8215: 8211: 8204: 8197: 8179: 8175: 8168: 8161: 8143: 8139: 8132: 8125: 8117: 8113: 8109: 8105: 8101: 8097: 8092: 8087: 8083: 8079: 8072: 8057: 8053: 8049: 8042: 8027: 8023: 8019: 8015: 8011: 8004: 7996: 7992: 7988: 7984: 7980: 7976: 7969: 7951: 7947: 7943: 7939: 7933: 7929: 7925: 7918: 7917: 7909: 7903: 7899: 7896: 7892: 7886: 7879: 7875: 7872: 7866: 7859: 7855: 7852: 7846: 7831: 7827: 7823: 7816: 7808: 7804: 7800: 7796: 7792: 7785: 7774: 7770: 7764: 7760: 7756: 7749: 7748: 7740: 7738: 7722: 7718: 7714: 7710: 7706: 7702: 7698: 7695:(4): 517–24. 
7694: 7690: 7686: 7679: 7677: 7661: 7657: 7653: 7649: 7645: 7641: 7637: 7633: 7629: 7625: 7618: 7616: 7608: 7604: 7601: 7595: 7593: 7585: 7581: 7578: 7572: 7554: 7547: 7546: 7538: 7531: 7530: 7526: 7523: 7516: 7501: 7497: 7493: 7486: 7479: 7478: 7474: 7471: 7466: 7461: 7450: 7449: 7441: 7439: 7423: 7419: 7415: 7411: 7407: 7403: 7399: 7394: 7389: 7385: 7381: 7377: 7370: 7368: 7359: 7355: 7351: 7349:0-8186-3870-2 7345: 7341: 7337: 7333: 7329: 7322: 7313: 7308: 7301: 7292: 7287: 7283: 7277: 7275: 7266: 7262: 7258: 7254: 7247: 7245: 7236: 7232: 7228: 7224: 7220: 7216: 7212: 7208: 7204: 7200: 7193: 7186: 7171: 7167: 7161: 7157: 7156: 7148: 7140: 7136: 7131: 7126: 7122: 7118: 7115:(3): 574–91. 7114: 7110: 7106: 7099: 7097: 7088: 7084: 7079: 7074: 7070: 7066: 7061: 7056: 7052: 7048: 7044: 7037: 7029: 7025: 7021: 7017: 7013: 7009: 7005: 6998: 6990: 6985: 6981: 6974: 6966: 6960: 6956: 6952: 6948: 6944: 6937: 6921: 6917: 6914:LeCun, Yann. 6910: 6901: 6896: 6889: 6880: 6875: 6868: 6859: 6854: 6847: 6839: 6835: 6831: 6827: 6823: 6817: 6813: 6809: 6804: 6799: 6794: 6789: 6785: 6781: 6774: 6772: 6770: 6768: 6753:on 2021-03-07 6752: 6748: 6747: 6739: 6737: 6717: 6710: 6706: 6700: 6681: 6678:: 1237–1242. 
6677: 6673: 6666: 6659: 6657: 6655: 6645: 6640: 6633: 6617: 6613: 6609: 6603: 6597: 6591: 6572: 6568: 6564: 6560: 6556: 6552: 6548: 6541: 6534: 6532: 6512: 6508: 6504: 6500: 6496: 6492: 6488: 6484: 6480: 6473: 6466: 6464: 6455: 6451: 6446: 6441: 6437: 6433: 6429: 6425: 6421: 6417: 6413: 6406: 6404: 6394: 6389: 6385: 6381: 6377: 6373: 6369: 6362: 6360: 6358: 6349: 6345: 6340: 6335: 6330: 6325: 6321: 6317: 6313: 6309: 6305: 6298: 6283: 6279: 6275: 6271: 6265: 6261: 6257: 6252: 6247: 6243: 6239: 6232: 6230: 6228: 6226: 6224: 6222: 6206: 6202: 6198: 6194: 6190: 6186: 6182: 6178: 6174: 6170: 6163: 6161: 6159: 6143: 6139: 6135: 6128: 6126: 6124: 6115: 6111: 6107: 6101: 6097: 6093: 6089: 6081: 6079: 6060: 6056: 6052: 6048: 6044: 6040: 6034: 6030: 6026: 6019: 6018: 6013: 6006: 5998: 5994: 5990: 5984: 5980: 5976: 5972: 5965: 5947: 5940: 5939: 5931: 5924: 5920: 5916: 5912: 5908: 5904: 5900: 5896: 5892: 5884: 5877: 5863: 5859: 5855: 5848: 5841: 5839: 5837: 5828: 5824: 5818: 5810: 5806: 5802: 5800:9783319575490 5796: 5792: 5785: 5783: 5775: 5763: 5759: 5755: 5751: 5747: 5743: 5739: 5735: 5728: 5713: 5709: 5705: 5701: 5697: 5693: 5689: 5685: 5681: 5677: 5670: 5655: 5651: 5645: 5641: 5640: 5632: 5630: 5614: 5610: 5604: 5601:. CRC Press. 
5600: 5599: 5591: 5589: 5580: 5576: 5572: 5568: 5564: 5560: 5556: 5552: 5548: 5544: 5540: 5533: 5529: 5516: 5510: 5501: 5494: 5490: 5486: 5480: 5476: 5466: 5463: 5461: 5458: 5456: 5453: 5451: 5448: 5446: 5443: 5441: 5440:Deep learning 5438: 5436: 5433: 5431: 5428: 5427: 5418: 5414: 5410: 5406: 5403: 5400: 5396: 5391: 5387: 5384: 5381: 5377: 5373: 5370: 5367: 5363: 5360: 5357: 5354: 5351: 5347: 5343: 5339: 5336: 5333: 5329: 5325: 5321: 5318: 5317: 5311: 5309: 5305: 5300: 5290: 5288: 5283: 5281: 5277: 5262: 5260: 5256: 5252: 5248: 5244: 5240: 5230: 5228: 5224: 5214: 5212: 5208: 5204: 5200: 5196: 5192: 5182: 5173: 5171: 5166: 5164: 5159: 5154: 5150: 5146: 5142: 5132: 5130: 5126: 5121: 5117: 5113: 5106:Checkers game 5103: 5101: 5097: 5093: 5089: 5085: 5081: 5077: 5073: 5069: 5059: 5051: 5049: 5045: 5041: 5031: 5029: 5025: 5021: 5017: 5013: 5002: 4998: 4995: 4991: 4987: 4982: 4977: 4975: 4971: 4970:video quality 4967: 4962: 4960: 4956: 4951: 4947: 4943: 4928: 4926: 4925:visual system 4920: 4918: 4912: 4903: 4889: 4869: 4866: 4861: 4847: 4815: 4804: 4794: 4792: 4786: 4782: 4780: 4776: 4766: 4764: 4753: 4748: 4733: 4728: 4718: 4716: 4712: 4708: 4703: 4701: 4697: 4693: 4692:deterministic 4688: 4680: 4676: 4662: 4659: 4656: 4642: 4640: 4635: 4619: 4615: 4606: 4590: 4570: 4548: 4544: 4534: 4520: 4511: 4497: 4477: 4474: 4471: 4463: 4448: 4446: 4442: 4438: 4428: 4425: 4417: 4407: 4403: 4397: 4396: 4391:This section 4389: 4385: 4380: 4379: 4375: 4365: 4363: 4359: 4357: 4346: 4344: 4340: 4336: 4335:anti-aliasing 4331: 4327: 4323: 4313: 4304: 4302: 4298: 4293: 4291: 4287: 4278: 4276: 4271: 4269: 4265: 4255: 4252: 4249: 4235: 4227: 4218: 4210: 4208: 4197: 4194: 4186: 4176: 4172: 4166: 4165: 4160:This section 4158: 4154: 4149: 4148: 4140: 4120: 4114: 4103: 4099: 4095: 4076: 4073: 4070: 4059: 4055: 4054:cross-entropy 4052: 4048: 4044: 4039: 4037: 4033: 4029: 4025: 4024:loss function 4019: 4015: 4014:Loss function 4005: 4003: 3999: 3995: 3991: 3980: 3978: 3960: 3957: 3947: 3944: 
3940: 3936: 3933: 3927: 3921: 3915: 3908: 3884: 3878: 3875: 3867: 3861: 3855: 3832: 3826: 3823: 3820: 3814: 3808: 3801: 3796: 3794: 3793:Yoshua Bengio 3790: 3786: 3767: 3764: 3761: 3752: 3746: 3740: 3733: 3729: 3725: 3715: 3712: 3710: 3694: 3671: 3668: 3665: 3662: 3659: 3566: 3562: 3544: 3541: 3538: 3530: 3526: 3505: 3502: 3452: 3448: 3444: 3441: 3438: 3433: 3429: 3425: 3420: 3416: 3407: 3403: 3399: 3397: 3393: 3389: 3372: 3343: 3340: 3337: 3334: 3331: 3316: 3312: 3310: 3306: 3302: 3300: 3295: 3276: 3256: 3250: 3230: 3227: 3224: 3197: 3193: 3189: 3185: 3179: 3175: 3167: 3163: 3155: 3151: 3147: 3143: 3137: 3133: 3121: 3117: 3113: 3111: 3090: 3086: 3080: 3076: 3069: 3065: 3040: 3036: 3027: 3023: 3009: 3006: 3003: 2995: 2973: 2953: 2950: 2947: 2927: 2924: 2921: 2899: 2895: 2891: 2886: 2880: 2868: 2864: 2853: 2849: 2843: 2840: 2834: 2824: 2820: 2816: 2814: 2813:Mixed Pooling 2805: 2792: 2788: 2785: 2783: 2775: 2770: 2768: 2767:max operation 2752: 2747: 2744: 2741: 2738: 2735: 2732: 2729: 2726: 2723: 2719: 2713: 2708: 2705: 2702: 2699: 2696: 2688: 2682: 2674: 2671: 2668: 2664: 2655: 2651: 2647: 2641: 2639: 2635: 2631: 2630:down-sampling 2622: 2614: 2609: 2608:Pooling layer 2602:Pooling layer 2599: 2595: 2593: 2589: 2585: 2581: 2576: 2574: 2563: 2548: 2545: 2542: 2522: 2518: 2511: 2508: 2505: 2499: 2496: 2488: 2484: 2465: 2462: 2457: 2453: 2450: 2447: 2444: 2441: 2438: 2428: 2427: 2426: 2412: 2392: 2372: 2352: 2338: 2330: 2315: 2312: 2309: 2301: 2297: 2281: 2278: 2275: 2272: 2264: 2260: 2259: 2255: 2251: 2250: 2245: 2244: 2243: 2241: 2237: 2227: 2221: 2217: 2213: 2209: 2204: 2202: 2192: 2183: 2181: 2177: 2173: 2171: 2167: 2163: 2159: 2155: 2151: 2143: 2138: 2125: 2121: 2111: 2108: 2100: 2090: 2086: 2080: 2079: 2074:This section 2072: 2068: 2063: 2062: 2054: 2052: 2048: 2040: 2036: 2032: 2029: 2025: 2020: 2016: 2012: 2007: 2004: 2000: 1999: 1998: 1996: 1995:visual cortex 1991: 1989: 1985: 1980: 1978: 1969: 1965: 1963: 1959: 1955: 1945: 1943: 1939: 1936: 1932: 1922: 1919: 
1917: 1913: 1909: 1904: 1901: 1896: 1894: 1888: 1886: 1882: 1878: 1873: 1871: 1861: 1853: 1844: 1842: 1837: 1835: 1824: 1822: 1817: 1811: 1807: 1801: 1791: 1789: 1778: 1776: 1771: 1768: 1763: 1761: 1751: 1748: 1744: 1742: 1737: 1733: 1723: 1721: 1717: 1707: 1705: 1700: 1698: 1694: 1690: 1685: 1681: 1679: 1675: 1671: 1663: 1660: 1659: 1658: 1656: 1652: 1642: 1636: 1632: 1631:complex cells 1629: 1626: 1623: 1622: 1621: 1618: 1616: 1612: 1608: 1604: 1600: 1596: 1586: 1584: 1574: 1571: 1555: 1549: 1544: 1537: 1532: 1526: 1515: 1503: 1499: 1496: 1489: 1487: 1483: 1479: 1474: 1465: 1461: 1459: 1455: 1444: 1442: 1432: 1430: 1426: 1415: 1401: 1398: 1395: 1387: 1383: 1378: 1376: 1371: 1366: 1357: 1353: 1351: 1345: 1343: 1338: 1335: 1333: 1328: 1326: 1316: 1314: 1310: 1306: 1302: 1298: 1294: 1293:hidden layers 1288: 1278: 1274: 1269: 1260: 1258: 1254: 1250: 1245: 1243: 1240:known as the 1239: 1235: 1232:. Individual 1231: 1230:visual cortex 1227: 1223: 1219: 1214: 1211: 1207: 1203: 1199: 1195: 1193: 1189: 1185: 1181: 1177: 1169: 1165: 1162: 1159: 1156: 1153: 1150: 1147: 1144: 1141: 1138: 1135: 1132: 1129: 1126: 1123: 1122: 1121: 1118: 1116: 1112: 1108: 1104: 1100: 1096: 1092: 1091:deep learning 1088: 1087:deep learning 1084: 1080: 1076: 1072: 1068: 1064: 1052: 1047: 1045: 1040: 1038: 1033: 1032: 1030: 1029: 1022: 1019: 1015: 1012: 1011: 1010: 1007: 1005: 1002: 1001: 995: 994: 987: 984: 982: 979: 977: 974: 972: 969: 967: 964: 962: 959: 957: 954: 953: 947: 946: 939: 936: 934: 931: 929: 926: 924: 921: 919: 916: 914: 911: 909: 906: 904: 901: 900: 894: 893: 886: 883: 881: 878: 876: 873: 871: 868: 867: 861: 860: 853: 850: 848: 845: 843: 842:Crowdsourcing 840: 838: 835: 834: 828: 827: 818: 815: 814: 813: 810: 808: 805: 803: 800: 798: 795: 794: 791: 786: 785: 777: 774: 772: 771:Memtransistor 769: 767: 764: 762: 759: 755: 752: 751: 750: 747: 745: 742: 738: 735: 733: 730: 728: 725: 723: 720: 719: 718: 715: 713: 710: 708: 705: 703: 700: 696: 693: 692: 691: 688: 684: 681: 679: 676: 674: 

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.