# Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding a local minimum of a function. It iteratively computes the gradient (the vector of partial derivatives) of the function and takes steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of ...
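As a reference point for the questions under this tag, here is a minimal batch-gradient-descent sketch for least-squares linear regression. All names and the toy data are illustrative, not taken from any particular question below:

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_iters=2000):
    """Minimize the mean squared error of X @ theta against y by batch gradient descent."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        grad = (2.0 / m) * X.T @ (X @ theta - y)  # gradient of the MSE cost
        theta -= lr * grad                        # step against the gradient
    return theta

# Toy data: y = 1 + 3*x, with a bias column of ones prepended
X = np.c_[np.ones(50), np.linspace(0.0, 1.0, 50)]
y = X @ np.array([1.0, 3.0])
theta = gradient_descent(X, y)
```

With a well-conditioned design matrix like this one, `theta` converges close to the true coefficients `[1, 3]`.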

**0** votes · **0** answers · 10 views

### Multivariable gradient descent Matlab - what is the difference between the two codes?

The following function finds the optimum "thetas" for a regression line using gradient descent. The inputs (X,y) are appended below. My question is what is the difference between code 1 and code 2? ...

**0** votes · **0** answers · 19 views

### Neural network Numpy - Gradient and L2

I have a neural network for linear regression: `class Neural(object): def __init__(self, x, y, hiddensize1, output_size, cost=None, alpha=0.01, reg_coef=0.1): std = 0.1; self.x = x` ...

**-4** votes · **0** answers · 47 views

### Neural network to recognize letters [on hold]

I have to create a neural network with one hidden layer and train it using the gradient descent algorithm. The input will be a 2D grid of binary numbers representing a letter. I am using a 15 by 7 grid of ...

**2** votes · **1** answer · 29 views

### Positional argument error when trying to train SGDClassifier on binary classification

I'm working through Aurelien Geron's Hands-On ML textbook and have got stuck trying to train an SGDClassifier. I'm using the MNIST handwritten numbers data and running my code in a Jupyter Notebook ...

**0** votes · **0** answers · 47 views

### Gradient Descent in Python from scratch produces wrong output

I am trying to implement gradient descent on randomly generated data to predict weight based on height and sex in Python. Below is the dataset I have created using just random values and also the set of ...

**2** votes · **1** answer · 46 views

### How does gradient descent weight/bias update work here?

I've been learning neural networks from Michael Nielsen's http://neuralnetworksanddeeplearning.com/chap1.html. In the section below, to update the weights and biases: `def update_mini_batch(self, ...`
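The rule the question asks about is the standard mini-batch update: average the per-example gradients over the batch, then step the weights against that average. A language-agnostic sketch in plain Python (hypothetical names, not Nielsen's actual code):

```python
def update_mini_batch(weights, grads_per_example, eta):
    """One mini-batch SGD step: weights[i] -= eta * mean_j(grads_per_example[j][i]).

    weights: list of floats; grads_per_example: one gradient list per example."""
    m = len(grads_per_example)
    for i in range(len(weights)):
        avg_grad = sum(g[i] for g in grads_per_example) / m
        weights[i] -= eta * avg_grad
    return weights
```

For example, a weight of `1.0` with per-example gradients `2.0` and `4.0` and `eta=0.5` moves to `1.0 - 0.5 * 3.0 = -0.5`.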

**0** votes · **0** answers · 19 views

### Tensorflow: Attempting to generate input given output vector after training

I have a tensorflow model that can map 6 input variables to 2 output variables with fairly good accuracy. What I want to do now is have a method of generating the inputs when given an output. My ...

**0** votes · **3** answers · 45 views

### pytorch how to set .requires_grad False

I want to freeze part of my model. Following the official docs: `with torch.no_grad(): linear = nn.Linear(1, 1); linear.eval(); print(linear.weight.requires_grad)` But it prints True ...
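For context: `torch.no_grad()` disables gradient *tracking* for operations run inside the block, but it does not flip the `requires_grad` flag on module parameters, which is why the print shows `True`. To freeze a layer, set the flag on its parameters explicitly. A minimal sketch:

```python
import torch
import torch.nn as nn

linear = nn.Linear(1, 1)
# Freeze the layer: flip requires_grad on each parameter in place.
for p in linear.parameters():
    p.requires_grad_(False)
```

Frozen parameters are then skipped by autograd, and they can also be excluded from the optimizer's parameter list.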

**-1** votes · **0** answers · 15 views

### Implementation of a kernel SVM, using stochastic gradient descent

I would like to use a kernel support vector machine to classify mails into spam or ham. My problem regards how to update the weights using the gradient of the hinge loss. The kernel here can ...
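For the linear case (the kernelized version follows the same pattern with inner products replaced by kernel evaluations), the subgradient of the regularized hinge loss for one example `(x, y)` is `lam * w - y * x` when the margin `y * w.x` is below 1, and just `lam * w` otherwise. A sketch of one stochastic subgradient step under that rule (names are mine):

```python
import numpy as np

def hinge_sgd_step(w, x, y, lr=0.1, lam=0.01):
    """One stochastic subgradient step on the regularized hinge loss
    L = lam/2 * ||w||^2 + max(0, 1 - y * w.x), with label y in {-1, +1}."""
    if y * (w @ x) < 1:          # margin violated: hinge term is active
        grad = lam * w - y * x
    else:                        # margin satisfied: only the regularizer
        grad = lam * w
    return w - lr * grad
```

Starting from `w = 0`, an example `x = [1, 2]` with `y = +1` violates the margin, so the step moves `w` toward `lr * x = [0.1, 0.2]`.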

**4** votes · **2** answers · 59 views

### Tensorflow, Keras: How to create a trainable variable that only update in specific positions?

For example, y = Ax, where A is a diagonal matrix with its trainable weights (w1, w2, w3) on the diagonal: A = [[w1, 0, 0], [0, w2, 0], [0, 0, w3]]. How to create such a trainable A in ...

**0** votes · **1** answer · 43 views

### What is wrong with this gradient descent algorithm?

X_train is already normalized using StandardScaler() and the categorical columns have been converted into one-hot encodings. X_train.shape = (32000, 37). I am using the following code to compute ...

**0** votes · **0** answers · 17 views

### Which of these is the correct implementation of cosine decay (learning-rate reweighting for neural nets)?

I've read the Loshchilov & Hutter paper on Stochastic Gradient Descent with Warm Restarts (SGDR), and I've found at least one implementation of it for Keras (like this one). However, I can ...
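For reference, the cosine-annealing schedule from the SGDR paper sets the learning rate at step `t` within a restart period of length `T` to `eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t / T))`. A direct transcription (the function name and defaults are mine):

```python
import math

def cosine_decay(t, T, eta_max=0.1, eta_min=0.0):
    """Cosine-annealed learning rate: eta_max at t=0, decaying to eta_min at t=T."""
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t / T))
```

At `t = 0` this returns `eta_max`, at `t = T` it returns `eta_min`, and at `t = T/2` it is exactly halfway between; warm restarts simply reset `t` to 0 at the start of each period.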

**0** votes · **0** answers · 42 views

### SGD diverging after changing learning rate

I am creating a function for stochastic gradient descent with ridge regression. I am keeping the step size constant for 1800 iterations and then changing it to 1/n or 1/sqrt(n). When I use 1/sqrt(n) ...

**1** vote · **1** answer · 43 views

### Logistic regression cost change turns constant

After just a few iterations of gradient descent, the change in the cost function becomes constant, which is definitely not how it should behave. The initial result of the gradient descent function seems ...
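For comparison, a minimal mean binary cross-entropy cost for logistic regression, which should decrease smoothly under a correct gradient descent loop (names are mine, not the asker's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y):
    """Mean binary cross-entropy of sigmoid(X @ theta) against labels y in {0, 1}."""
    h = sigmoid(X @ theta)
    eps = 1e-12  # guard against log(0)
    return -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))
```

A quick sanity check: with `theta = 0` every prediction is 0.5, so the cost is `log 2 ≈ 0.693` regardless of the labels.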

**1** vote · **0** answers · 15 views

### Unable to get the correct SVM gradient using vectorization

I was attempting CS231n Assignment 1, vectorizing the computation of the SVM gradients. dW is the gradient matrix. The following is my attempt: `def svm_loss_vectorized(W, X, y, reg): ...`
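For reference, one common fully vectorized form of the multiclass hinge (SVM) loss and its gradient in the CS231n setup. The signature follows the question; the body is my sketch, not the asker's code:

```python
import numpy as np

def svm_loss_vectorized(W, X, y, reg):
    """Multiclass hinge loss and gradient, fully vectorized.

    W: (D, C) weights; X: (N, D) data; y: (N,) integer labels; reg: L2 strength."""
    N = X.shape[0]
    scores = X @ W                                    # (N, C) class scores
    correct = scores[np.arange(N), y][:, None]        # (N, 1) correct-class scores
    margins = np.maximum(0, scores - correct + 1.0)   # hinge margins, delta = 1
    margins[np.arange(N), y] = 0                      # no loss for the correct class
    loss = margins.sum() / N + reg * np.sum(W * W)

    binary = (margins > 0).astype(float)              # 1 where the margin is positive
    binary[np.arange(N), y] = -binary.sum(axis=1)     # correct class gets -count
    dW = X.T @ binary / N + 2 * reg * W
    return loss, dW
```

Sanity check: with `W = 0`, one example `x = [1, 0]`, label 0, and three classes, all scores are 0, so both wrong classes contribute a margin of 1 (loss 2.0) and the correct-class column of dW gets the negative count.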