• In network science, a gradient network is a directed subnetwork of an undirected "substrate" network where each node has an associated scalar potential...
    12 KB (1,512 words) - 20:54, 23 May 2025
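A minimal sketch of the construction just described, assuming the usual convention that each node points to the lowest-potential member of its closed neighborhood; the toy graph, random potentials, and tie-breaking are illustrative choices, not from the article:
```python
import random

def gradient_network(adjacency, potential):
    """adjacency: dict node -> set of neighbors; potential: dict node -> float."""
    edges = {}
    for node, neighbors in adjacency.items():
        candidates = neighbors | {node}                   # node competes with its neighbors
        edges[node] = min(candidates, key=potential.get)  # directed gradient edge
    return edges

# Toy substrate: a 4-node ring with random scalar potentials.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
pot = {n: random.random() for n in adj}
print(gradient_network(adj, pot))
```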
  • gradient problem is the problem of greatly diverging gradient magnitudes between earlier and later layers encountered when training neural networks with...
    24 KB (3,705 words) - 18:55, 18 June 2025
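A small numerical illustration (not drawn from the article): in a deep chain of sigmoid units the backpropagated gradient is a product of per-layer factors, so its magnitude typically shrinks geometrically with depth; the depth and weight scale are arbitrary choices:
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
grad, x = 1.0, 0.5
for layer in range(50):
    w = rng.normal()              # random scalar weight for this layer
    x = sigmoid(w * x)
    grad *= w * x * (1 - x)       # chain rule: sigmoid'(pre) * d(pre)/d(x_prev)
    if layer % 10 == 9:
        print(f"layer {layer + 1}: |gradient| = {abs(grad):.3e}")
```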
  • Network topology
    Broadcast communication network Butterfly network Computer network diagram Gradient network Internet topology Network simulation Relay network Rhizome (philosophy)...
    40 KB (5,238 words) - 09:07, 24 March 2025
  • Backpropagation (category Artificial neural networks)
    machine learning, backpropagation is a gradient computation method commonly used to compute parameter updates when training a neural network. It is...
    56 KB (7,993 words) - 15:52, 29 May 2025
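A hedged sketch of backpropagation on a two-layer network with squared loss; the architecture, shapes, and variable names are illustrative, not taken from the article:
```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3,))          # input
y = rng.normal(size=(2,))          # target
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))

# Forward pass.
h = np.tanh(W1 @ x)
y_hat = W2 @ h
loss = 0.5 * np.sum((y_hat - y) ** 2)

# Backward pass: propagate the loss gradient layer by layer (chain rule).
d_yhat = y_hat - y                 # dL/dy_hat
dW2 = np.outer(d_yhat, h)          # dL/dW2
d_h = W2.T @ d_yhat                # dL/dh
d_hpre = d_h * (1 - h ** 2)        # through the tanh nonlinearity
dW1 = np.outer(d_hpre, x)          # dL/dW1
print(loss, dW1.shape, dW2.shape)
```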
  • extension of gradient descent, stochastic gradient descent, serves as the most basic algorithm used for training most deep networks today. Gradient descent...
    39 KB (5,600 words) - 18:38, 18 May 2025
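A minimal gradient-descent loop; the quadratic objective, learning rate, and iteration count are arbitrary illustrative choices:
```python
# Minimize f(w) = (w - 3)^2 by repeatedly stepping against the gradient.
w = 0.0
lr = 0.1
for step in range(50):
    grad = 2 * (w - 3)   # analytic gradient of f
    w -= lr * grad       # descend along the negative gradient
print(w)  # converges toward the minimizer w = 3
```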
  • Gradient boosting is a machine learning technique based on boosting in a functional space, where the targets are pseudo-residuals instead of residuals as...
    28 KB (4,259 words) - 20:19, 14 May 2025
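A sketch of the boosting loop under squared loss, where the pseudo-residuals reduce to y - F(x); the one-split stump learner and all hyperparameters are illustrative choices:
```python
import numpy as np

def fit_stump(x, r):
    """Best single-threshold split minimizing squared error against r."""
    best = None
    for t in np.unique(x):
        left, right = r[x <= t], r[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda z: np.where(z <= t, lv, rv)

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = np.sin(x) + rng.normal(scale=0.1, size=200)

F = np.zeros_like(y)             # current ensemble prediction
lr = 0.3
for _ in range(50):
    residuals = y - F            # pseudo-residuals for squared loss
    h = fit_stump(x, residuals)  # weak learner fit to the residuals
    F += lr * h(x)               # shrinkage step
print(np.mean((y - F) ** 2))     # training error decreases over rounds
```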
  • providing a unifying view of gradient calculation techniques for recurrent networks with local feedback. One approach to gradient information computation in...
    90 KB (10,419 words) - 09:51, 27 May 2025
  • Neural network (machine learning)
    values in a given dataset. Gradient-based methods such as backpropagation are usually used to estimate the parameters of the network. During the training phase...
    169 KB (17,641 words) - 00:21, 11 June 2025
  • intelligent agent. Specifically, it is a policy gradient method, often used for deep RL when the policy network is very large. The predecessor to PPO, Trust...
    17 KB (2,504 words) - 18:57, 11 April 2025
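A sketch of PPO's clipped surrogate objective, the core of its policy-gradient update; the epsilon value and toy numbers are illustrative:
```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """ratio = pi_new(a|s) / pi_old(a|s); maximize the clipped surrogate."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantage
    return -np.minimum(unclipped, clipped).mean()  # negated for minimization

ratios = np.array([0.7, 1.0, 1.5])
advantages = np.array([1.0, -0.5, 2.0])
print(ppo_clip_loss(ratios, advantages))
```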
  • conjugate gradient (Fletcher–Reeves update, Polak–Ribière update, Powell–Beale restart, scaled conjugate gradient). Let N be a network with...
    12 KB (1,790 words) - 11:34, 24 February 2025
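A Fletcher–Reeves conjugate-gradient sketch on a small convex quadratic, where the exact line search has a closed form; the problem data are illustrative:
```python
import numpy as np

# Minimize f(w) = 0.5 w^T A w - b^T w with nonlinear CG (Fletcher–Reeves).
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
grad = lambda w: A @ w - b

w = np.zeros(2)
g = grad(w)
d = -g                                # initial search direction
for _ in range(10):
    if g @ g < 1e-12:                 # converged
        break
    alpha = (g @ g) / (d @ A @ d)     # exact line search on a quadratic
    w = w + alpha * d
    g_new = grad(w)
    beta = (g_new @ g_new) / (g @ g)  # Fletcher–Reeves coefficient
    d = -g_new + beta * d             # next conjugate direction
    g = g_new
print(w, np.linalg.solve(A, b))       # CG iterate vs. exact solution
```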
  • Rectifier (neural networks)
    initialized network, only about 50% of hidden units are activated (i.e. have a non-zero output). Better gradient propagation: fewer vanishing gradient problems...
    23 KB (3,056 words) - 12:14, 15 June 2025
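A quick numerical check of the roughly-50% claim, assuming zero-mean random inputs and weights so that pre-activations are symmetric about zero; the sizes are illustrative:
```python
import numpy as np

relu = lambda z: np.maximum(0.0, z)

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
W = rng.normal(size=(500, 1000)) / np.sqrt(1000)
h = relu(W @ x)
print((h > 0).mean())  # about 0.5 of hidden units have non-zero output
```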
  • Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e...
    53 KB (7,031 words) - 21:06, 15 June 2025
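A mini-batch SGD sketch for least-squares linear regression; the batch size, learning rate, and synthetic data are illustrative:
```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

w = np.zeros(3)
lr, batch = 0.05, 32
for step in range(500):
    idx = rng.integers(0, len(X), size=batch)  # sample a random mini-batch
    Xb, yb = X[idx], y[idx]
    grad = 2 * Xb.T @ (Xb @ w - yb) / batch    # noisy gradient estimate
    w -= lr * grad
print(w)  # approaches true_w
```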
  • Feedforward neural network
    Shun'ichi Amari reported the first multilayered neural network trained by stochastic gradient descent, which was able to classify non-linearly separable...
    21 KB (2,242 words) - 20:16, 25 May 2025
  • Network science
    theory Gradient network Higher category theory Immune network theory Irregular warfare Network analyzer Network dynamics Network formation Network theory...
    69 KB (9,905 words) - 15:52, 14 June 2025
  • Residual neural network
    discovered the vanishing gradient problem in 1991 and argued that it explained why the then-prevalent forms of recurrent neural networks did not work for long...
    28 KB (3,042 words) - 23:27, 7 June 2025
  • performance. In very deep networks, batch normalization can initially cause a severe gradient explosion—where updates to the network grow uncontrollably large—but...
    30 KB (5,892 words) - 04:30, 16 May 2025
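A batch-normalization sketch over one mini-batch: standardize each feature across the batch, then rescale with learnable gamma and beta; shapes and eps are illustrative:
```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance per feature
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(32, 4))
out = batch_norm(batch, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(6), out.std(axis=0).round(3))
```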
  • training the wide neural network and kernel methods: gradient descent in the infinite-width limit is fully equivalent to kernel gradient descent with the NTK...
    35 KB (5,146 words) - 10:08, 16 April 2025
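The equivalence is usually stated through the tangent kernel; a sketch of the standard definitions (paraphrased, not quoted from the article):
```latex
% Tangent kernel of a network f(x; \theta):
\Theta(x, x') = \nabla_\theta f(x;\theta)^{\top}\,\nabla_\theta f(x';\theta)
% In the infinite-width limit, gradient-descent training follows the kernel
% gradient flow induced by \Theta on the training points x_i:
\frac{\mathrm{d} f_t(x)}{\mathrm{d} t}
  = -\sum_i \Theta(x, x_i)\,\nabla_{f_t(x_i)}\mathcal{L}
```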
  • Weight initialization (category Artificial neural networks)
    speed of convergence, the scale of neural activation within the network, the scale of gradient signals during backpropagation, and the quality of the final...
    24 KB (2,916 words) - 09:19, 25 May 2025
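A sketch of variance-preserving, Glorot/He-style fan-in scaling, showing the activation scale staying stable across depth; the widths, depth, and tanh nonlinearity are illustrative:
```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=256)
for _ in range(20):
    W = rng.normal(size=(256, 256)) / np.sqrt(256)  # 1/sqrt(fan_in) scaling
    x = np.tanh(W @ x)
print(np.std(x))  # stays on a moderate scale rather than exploding or vanishing
```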
  • Mathematical optimization
    gradient method (Frank–Wolfe) for approximate minimization of specially structured problems with linear constraints, especially with traffic networks...
    53 KB (6,155 words) - 23:42, 31 May 2025
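A Frank–Wolfe (conditional gradient) sketch minimizing a quadratic over the probability simplex; each step solves a linear subproblem over the feasible set by picking a vertex, so iterates stay feasible without projection. The problem data are illustrative:
```python
import numpy as np

A = np.diag([3.0, 1.0, 2.0])
b = np.array([1.0, 0.2, 0.5])
grad = lambda w: A @ w - b   # gradient of 0.5 w^T A w - b^T w

w = np.ones(3) / 3                # start at the simplex center
for k in range(100):
    g = grad(w)
    s = np.zeros(3)
    s[np.argmin(g)] = 1.0         # linear minimizer over the simplex is a vertex
    step = 2.0 / (k + 2)          # classic step-size schedule
    w = (1 - step) * w + step * s
print(w, w.sum())                 # feasible: entries sum to 1
```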
  • as the transformer. Vanishing gradients and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by the regularization...
    138 KB (15,585 words) - 07:00, 4 June 2025
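The snippet does not name the mechanism; layer normalization is one common stabilizer in transformers, sketched here under that assumption with an illustrative eps:
```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize across the feature vector, not across the batch.
    return (x - x.mean()) / np.sqrt(x.var() + eps)

print(layer_norm(np.array([1.0, 2.0, 3.0, 10.0])))
```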
  • Reinforcement learning
    Williams, Ronald J. (1987). "A class of gradient-estimating algorithms for reinforcement learning in neural networks". Proceedings of the IEEE First International...
    69 KB (8,194 words) - 13:01, 17 June 2025
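A REINFORCE-style estimator in the spirit of Williams's gradient-estimating algorithms, on a two-armed bandit so the score function has a closed form; every specific here is an illustrative choice:
```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.0                       # logit for choosing arm 1
rewards = {0: 0.2, 1: 1.0}        # arm 1 pays more
lr = 0.1
for _ in range(2000):
    p1 = 1 / (1 + np.exp(-theta))     # pi(arm 1)
    a = int(rng.random() < p1)
    r = rewards[a]
    grad_logp = a - p1                # d log pi(a) / d theta for a Bernoulli policy
    theta += lr * r * grad_logp       # ascend the estimated policy gradient
print(1 / (1 + np.exp(-theta)))       # probability of the better arm grows toward 1
```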
  • Generative adversarial network
    high-dimensional space of all possible neural network functions. The standard strategy of using gradient descent to find the equilibrium often does not...
    95 KB (13,887 words) - 09:25, 8 April 2025
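A toy illustration of that failure mode (not from the article): on the bilinear game min_x max_y xy, simultaneous gradient steps spiral away from the equilibrium at the origin rather than converging:
```python
import numpy as np

x, y = 1.0, 1.0
lr = 0.1
for _ in range(100):
    gx, gy = y, x                       # grad_x(xy) = y, grad_y(xy) = x
    x, y = x - lr * gx, y + lr * gy     # simultaneous descent (x) and ascent (y)
print(x, y, np.hypot(x, y))             # distance from (0, 0) has grown
```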
  • Gating mechanism (category Neural network architectures)
    In neural networks, the gating mechanism is an architectural motif for controlling the flow of activation and gradient signals. It is most prominently...
    8 KB (1,166 words) - 21:49, 27 January 2025
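A minimal gating sketch, assuming the common LSTM/GRU-style pattern in which a sigmoid gate interpolates between the old state and a candidate; shapes and values are illustrative:
```python
import numpy as np

sigmoid = lambda z: 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
state = rng.normal(size=4)
candidate = rng.normal(size=4)
gate = sigmoid(rng.normal(size=4))               # per-unit gate values in (0, 1)
new_state = gate * candidate + (1 - gate) * state
print(gate.round(2), new_state.round(2))
```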
  • Seven Network
    five stations of the network. The logo was simplified in 2003, effectively becoming simply two angled trapezoids, losing its gradient, shadows and colour-coded...
    100 KB (9,979 words) - 08:12, 15 June 2025
  • network Dynamic network analysis Dynamic single-frequency networks Gaussian network model Gene regulatory network Gradient network Network planning and design...
    3 KB (263 words) - 23:22, 26 August 2023
  • by the gradient of the function at the current point. Examples of gradient methods are gradient descent and the conjugate gradient method. Gradient descent...
    1 KB (109 words) - 05:36, 17 April 2022
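The generic iteration in its standard form (paraphrased, not quoted from the article):
```latex
% Step from the current point x_k along the negative gradient with step size gamma_k:
x_{k+1} = x_k - \gamma_k\,\nabla f(x_k), \qquad \gamma_k > 0
```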
  • Backpropagation through time (category Artificial neural networks)
    through time (BPTT) is a gradient-based technique for training certain types of recurrent neural networks, such as Elman networks. The algorithm was independently...
    6 KB (745 words) - 21:06, 21 March 2025
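A BPTT sketch on a scalar linear recurrence with a loss on the final state, showing the gradient of the shared weight accumulating across unrolled time steps; the sequence and target are illustrative:
```python
# Recurrence: h_t = w * h_{t-1} + x_t, loss L = 0.5 * (h_T - target)^2.
w = 0.9
xs = [0.5, -0.2, 0.1, 0.4]
target = 1.0

# Forward pass, storing the states needed for the backward pass.
hs = [0.0]
for x in xs:
    hs.append(w * hs[-1] + x)
loss = 0.5 * (hs[-1] - target) ** 2

# Backward pass through time: carry dL/dh_t backwards, accumulate dL/dw.
dh = hs[-1] - target
dw = 0.0
for t in range(len(xs), 0, -1):
    dw += dh * hs[t - 1]   # h_t = w*h_{t-1} + x_t contributes h_{t-1} to dL/dw
    dh *= w                # propagate gradient to the previous state
print(loss, dw)
```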
  • You Only Look Once (category Neural networks)
    with the highest IoU with the ground truth bounding boxes is used for gradient descent. Concretely, let j be that predicted bounding...
    10 KB (1,222 words) - 21:29, 7 May 2025
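An IoU sketch for axis-aligned boxes in (x1, y1, x2, y2) form, the overlap score used for the matching described above; the example boxes are illustrative:
```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

pred = (0.0, 0.0, 2.0, 2.0)
truth = (1.0, 1.0, 3.0, 3.0)
print(iou(pred, truth))  # 1/7, about 0.143
```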
  • Graph neural networks (GNNs) are specialized artificial neural networks that are designed for tasks whose inputs are graphs. One prominent example is molecular...
    43 KB (4,791 words) - 17:50, 17 June 2025
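A single message-passing layer sketch, the basic GNN operation; the mean aggregator, the combine rule, and the shapes are illustrative choices:
```python
import numpy as np

def message_pass(H, adjacency, W):
    """H: (n, d) node features; adjacency: dict node -> list of neighbors."""
    out = np.zeros_like(H)
    for v, neighbors in adjacency.items():
        agg = H[neighbors].mean(axis=0) if neighbors else np.zeros(H.shape[1])
        out[v] = np.maximum(0.0, W @ (H[v] + agg))  # combine self + aggregated neighbors
    return out

adj = {0: [1, 2], 1: [0], 2: [0]}
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 4)) / 2
print(message_pass(H, adj, W))
```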
  • the vanishing gradient problem. As long as the forget gates of the 2000 LSTM are open, it behaves like the 1997 LSTM. The Highway Network of May 2015 applies...
    11 KB (1,316 words) - 20:57, 10 June 2025
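A highway-layer sketch, y = T*H(x) + (1 - T)*x, where a nearly closed transform gate (T close to 0) carries the input and its gradient through unchanged, analogous to an open LSTM forget gate; the weights and gate bias are illustrative:
```python
import numpy as np

sigmoid = lambda z: 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=6)
Wh = rng.normal(size=(6, 6)) * 0.3
Wt = rng.normal(size=(6, 6)) * 0.3
bt = -2.0 * np.ones(6)            # bias the gate toward carrying x through

T = sigmoid(Wt @ x + bt)          # transform gate in (0, 1)
H = np.tanh(Wh @ x)
y = T * H + (1 - T) * x           # carry behavior dominates when T is near 0
print(T.round(2), y.round(2))
```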