This video includes lyrics on the screen
-------------------------------------------
Google Pixel 3 may finally get this major missing feature on the Pixel 2 ● Tech News ● #TECH - Duration: 2:18. GOOGLE'S Pixel 3 continues to generate excitement, and a new leak could reveal that wireless charging
is finally coming to the search giant's flagship phone.
The Google Pixel 3 looks set to bring a swathe of new features to this hugely popular smartphone
brand.
Rumours are rife that Google is likely to include a faster Qualcomm Snapdragon 845 processor,
improved camera and better battery life.
It's also thought that the larger Pixel 3 XL will get a full edge-to-edge screen which
will cover almost the entire front of the device.
In fact, the only part of the Pixel that won't be filled with this display is a small notch
at the top and slight chin at the bottom of the phone.
Now another new leak may reveal that the latest Pixel is getting something many were hoping
would appear on the Pixel 2 - wireless charging.
9to5Google is reporting that alongside the updated phone will be a new accessory
called the Pixel Stand.
This dock is thought to be able to add power to the Pixel 3 without the need for wires
and that's not all.
Once placed on the Stand, users may also get seamless Google Assistant integration even
when their phone is locked.
The Pixel 2 doesn't include wireless charging and that puts it behind some key rivals including
Samsung, LG and Apple.
Adding this new way of refilling the device would clearly be a popular move and would
certainly make sense as it will future-proof the phone.
This latest leak comes as a new picture was recently released on the web which claims
to show the larger Pixel 3 XL in all its glory.
Gizmochina says it has managed to see images of a set of CAD drawings detailing how the
phone will look inside a case.
Although Google seems to be going for a single camera on the back, there could be the inclusion
of a double front-facing system which may add some DSLR-style depth of field to selfies.
Two final features revealed in the image are the space for a stereo speaker at the base
of the phone and a rear-mounted fingerprint scanner in the middle of the device.
-------------------------------------------
A 1905 postcard: the Copernicus Monument with the Karaś Palace in the background - Duration: 0:58. In our varsaviana series, we present a postcard on a subject
seemingly captured in tens of thousands of identical,
well-worn shots,
that is, a postcard with the Copernicus Monument.
But not all postcards are created equal.
Of course, the presented card is nicely framed, but that's not the point.
In the foreground, the eye lingers on figures from the early twentieth century;
in the middle ground it catches our outstanding scholar;
but what is interesting
sits only in the background,
counting by planes, in the third plane.
Well, in the background behind the monument stands the Karaś Palace, which no longer exists:
after many protests, discussions and adventures, it was finally demolished in 1912.
Thus, the postcard is, however, extraordinary -
Quod erat demonstrandum.
Please visit www.atticus.pl
and browse the varsaviana and old postcards sections.
-------------------------------------------
Live in the D: The stray python in Ferndale finds a home - Duration: 4:49.
-------------------------------------------
Audi A5 Sportback 2.0 TDI quattro Pro Line S Schuifdak, Leer, Navi Xenon Led, DVD, full - Duration: 1:14.
-------------------------------------------
FIRST concert Ever! TAYLOR SWIFT Reputation Tour! - Duration: 4:49. So tonight mommy is going to a concert. I'm really excited, I'm going to Taylor Swift.
I'm gonna get ready while Kenzie plays with her new princess castle. Daddy's
gonna watch the kids while mommy goes.
here's what I'm wearing tonight
Kenzie likes getting ready with me. And what's this, Kenzie? It's a spray, and
it's sparkly. Okay, you're ready? Yeah, thank you.
Shake it up, it's like fairy dust, huh? Yeah, really good. It's called A Thousand
Wishes. You ready for the fairy spray? Close your eyes,
spin around. Now you're all sparkly. Kenzie's helping me pick out some shoes.
Which ones do you think I should wear? She grabbed these ones and said these
ones are the best. They're like sparkly and glittery, spiky.
Okay, so... are they hard to walk in?
So Kenzie, what are you wearing today?
You have a headband, you have these bracelets. Tell me about these bracelets.
Birthday? You know who you remind me of?
All three burgers for Heather? Yes, I got three burgers.
I've got a big appetite. Okay, so we're walking into the concert. It was kind of
a crazy journey here because I'm really bad at driving directions, but we're
here. Let me show you the view.
-------------------------------------------
My SECRET to Shutting Down Haters - Duration: 6:04.
-------------------------------------------
Steve Rogers Meets Natasha Romanoff And Bruce Banner | Marvel's The Avengers (2012) - Duration: 4:21. Stow the captain's gear.
Yes, sir.
Agent Romanoff, Captain Rogers.
- Ma'am. - Hi.
They need you on the bridge.
They're starting the face-trace.
See you there.
It was quite the buzz around here, finding you in the ice.
I thought Coulson was gonna swoon.
Did he ask you to sign his Captain America trading cards yet?
Trading cards?
They're vintage. He's very proud.
Dr Banner.
Yeah, hi.
They told me you would be coming.
Word is, you can find the Cube.
Is that the only word on me?
Only word I care about.
It must be strange for you, all of this.
Well, this is actually kind of familiar.
Gentlemen, you might want to step inside in a minute.
It's going to get a little hard to breathe.
Flight crew, secure the deck.
Is this a submarine?
Really?
They want me in a submerged, pressurised, metal container?
No, no, this is much worse.
Hover power check complete. Position cyclic.
Increase collective to 8.0%.
Preparing for maximum performance takeoff.
Increase output to capacity.
Power plant performing at capacity.
We are clear.
All engines operating.
S.H.I.E.L.D. Emergency Protocol 193.6 in effect.
- We are at level, sir. - Good.
Let's vanish.
Engage retro-reflection panels.
Reflection panels engaged.
Gentlemen.
Doctor, thank you for coming.
Thanks for asking nicely.
So, how long am I staying?
Once we get our hands on the Tesseract,
you're in the wind.
Where are you with that?
We're sweeping every wirelessly accessible
camera on the planet.
Cell phones, laptops...
If it's connected to a satellite, it's eyes and ears for us.
That's still not gonna find them in time.
You have to narrow your field.
How many spectrometers do you have access to?
- How many are there? - Call every lab you know.
Tell them to put the spectrometers on the roof
and calibrate them for gamma rays.
I'll rough out a tracking algorithm, basic cluster recognition.
At least we could rule out a few places.
Do you have somewhere for me to work?
Agent Romanoff,
could you show Dr Banner to his laboratory, please?
You're gonna love it, Doc. We got all the toys.
-------------------------------------------
Do You know? #July31st - Duration: 1:42. Do you know that on July 31st, 2012, Michael Phelps beat Larisa Latynina's record number
of Olympic medals?
Michael Fred Phelps was born on June 30, 1985 in Baltimore, Maryland.
He is an American swimmer and he holds world records in several events.
Phelps won eight medals at the 2004 Summer Olympics in Athens.
Six of those were gold.
These medals made him tie the record for most medals at a single Olympics,
which Alexander Dityatin had held since 1980.
In 2008, Phelps won eight swimming gold medals at the Summer Olympics in Beijing.
This broke Mark Spitz's record for most gold medals in a single Olympics.
Spitz had won seven gold medals at the 1972 Summer Olympics.
At the 2012 Summer Olympics in London, Phelps won another four gold and two silver medals.
At the 2016 Summer Olympics in Rio de Janeiro, Phelps won five gold medals and one silver.
In total he has won 28 Olympic medals, a record.
23 of these are gold medals, more than twice as many as the previous record.
-------------------------------------------
Learning Maths-Graphs - Duration: 0:46.
-------------------------------------------
'Segundo Sol': Valentim confronts Karola after discovering Beto's million-dollar embezzlement - Duration: 7:44.
-------------------------------------------
Awesome First Time Offroad Adventure - Barnwell Mountain - Duration: 3:29. Yee Haw!!
this is kinda scary, agh
(screaming)
oh my gosh
she wants you to do it
NO! NO NO NO!
Oh, that wasn't too bad.
Alright. We made it to Barnwell Mountain.
It's Olivia's first ride really doing off-roading in a RZR.
what did ya think?
FUN
it was fun?
what was your favorite part?
(driving sounds) going up and down
and muddy! and you got muddy
YEAH!!!
It splashed us everywhere.
I know.
I want to do it again. You want to do it again?
Well, I think we're gonna eat lunch.
You mean dinner?
Dinner.
We're gonna eat dinner, then we will go back out.
Ok.
Then we will do that again.
WooHoo!
Can I take this off?
WooHoo.
Yeah, take it off.
Don't look down.
(laughter and screams)
GiGi is gonna get even muddier.
I wonder what they're gonna think about that?
Was that a long ride?
Did you fall asleep?
Yeah, a lot.
It was rough, yeah.
Nobody's gonna let me drive!?
Can I go take a bath?
-------------------------------------------
Tom Hardy's Transformation in the Venom Trailer Will Terrify You - Duration: 1:18.
-------------------------------------------
Audi A3 1.6 TDI 110 PK S-Tronic Sportback Attraction (BNS) - Duration: 1:05.
-------------------------------------------
INCREDIBLE HIDDEN TREASURES IN THE MONTE DEL PIRENÉE ORIENTALE | Babylon Metal Detector - Duration: 12:17. Here it is
ole !!
a medal !!
let's see what this is that sounds so good
everything is stones
what joy ...
here it is
a tin
ancient
Is it copper?
it's like copper ... it's green, you see?
to know what is
okay...
trash
it could be a coin
here ... a coin
yes
it's a "ARDITE"
here we have you
well ... let's go well
beautiful!!
very high signal and it sounds erratic
or an iron horseshoe ...
or a can
I wish it were something good ... but
I have my doubts, here is
it's an iron
thin iron like ...
a knife folded or ...
yes, it has a knife shape
knife blade, but it's undone
so...
we'll bury it again so it ends up rotting
and that's it
I think it's another coin
lets go see it
seal...
it's a lead seal
Seal
It is similar to a coin
let's see it ... to believe it
here it is...
Ole!!
a medal and this ...
another Roman medal
this is the 7 ...
I found another one like this
look how pretty
the 7 mysteries ... or something like that
has the hitch still ...
beautiful, no?
a trash
foil
very big this will be a can
next to these skewers
here is an ox shoe
an ox shoe ... today will be a good day
A day is not good if I don't find a horseshoe
my recipe ... one a day at least
It's a 68
theoretically it is copper
A zipper
modern
I ran out of battery ...
there is a signal ...
similar to copper
it's lead
it's a lead weight from a wheel
the kind used to balance the wheels
pity!!
can
a horseshoe in the sun
to the bag goes
wow !!
It has a decoration
has here like a flower
I was above
It's a piece ...
ah, it looks like a bullet
a bullet fired
look ... a coin!
Ole...Ole!!...a "SEISENO"
It's missing a corner, but ...
You see ... pretty
coin ... pretty
here is ... a bullet
a musket bullet
here ... a coin
Ole! "ARDITE"
ardite in sight
all right!!
it resisted
I already said that it was a good signal
after this ARDITE I found another SEISENO
and this bronze PREMONEDA
THANK YOU FOR WATCHING OUR VIDEOS ... WE WOULD APPRECIATE A LIKE AND A SUBSCRIPTION
-------------------------------------------
Duchess Meghan: Now the family situation is escalating completely! - Duration: 3:52.
-------------------------------------------
DS DS 4 PURETECH 130PK S&S CHIC DAB+/18''/NAVI - Duration: 1:07.
-------------------------------------------
"I'm getting my ex back" - what is the method? - Duration: 12:27.
-------------------------------------------
Live in the D: Uniquely Detroit - Drew & Mike Podcast - Duration: 3:01.
-------------------------------------------
Live in the D: The stray python in Ferndale finds a home - Duration: 4:49.
-------------------------------------------
Live in the D: Shop at People's Records in the Eastern Market District - Duration: 4:33.
-------------------------------------------
Live in the D: JLF Paddle Boards are almost like walking on water - Duration: 4:56.
-------------------------------------------
Trump Voices Skepticism on Sale of 3-D Printed Guns - Duration: 4:36.
-------------------------------------------
A mother killed her children because of a silly mistake while making their breakfast - Duration: 6:23.
-------------------------------------------
Honda CR-V 2.2D ELEGANCE Leather interior, Panoramic roof - Duration: 1:11.
-------------------------------------------
Toyota Avensis Wagon 2.0 D-4D Executive Business Leather+Navigation - Duration: 1:08.
-------------------------------------------
Medina: Ethnic Studies Key to Students' Success - Duration: 2:59. What do we want? Ethnic studies! When do we want it? Now!
What do we want? Ethnic studies! When do we want it? Now!
For people like Jose Lara, ethnic studies represents more than just a class. He's a social studies
teacher in the Los Angeles unified school district. For Jose ethnic studies are the
keys to the hearts and souls of the young people he has worked with.
Ethnic studies helps students: every study that has been done
has shown increased graduation rates, increased academic achievement,
increased attendance rates, and a decreased dropout rate for students who take ethnic studies.
As a school board member in the Los Angeles area, Lara got the LAUSD to make
ethnic studies a requirement for graduation. I was doing really poorly in school, I didn't
really care about school.
Karla Gomez Pelayo, a UC Berkeley graduate in ethnic studies, credits it with giving her direction in life.
Ethnic studies saved me, and I mean that literally and figuratively;
it re-engaged me in my education, it helped me understand my cultural roots and identity.
It is a very empowering experience for students. It is something that students crave, and
it's almost like a light goes on.
It is testimony like this and his personal experience of many
years in the classroom as a teacher that motivated Assemblymember Jose Medina to push
forward legislation to make ethnic studies classes mandatory in California high schools.
We're not talking about something that's just a benefit to certain ethnic groups to Latinos or
African Americans but ethnic studies has demonstrated that it can raise
academic achievement across all groups.
Assembly Bill 2772 would make ethnic studies a high school graduation requirement starting in 2023.
What you do can make a difference - that was a recurring theme in ethnic studies.
Danielle is a student at Hiram Johnson high school in Sacramento enrolled in ethnic studies.
I was born in Sacramento. I also am Native American, Puerto Rican, Portuguese,
White, Mexican.
And..
Yeah and I have family from all over the world basically. So like people hearing my stories
I would hope that they understand what I've been through and what I'm going through and
then kind of make that connection with me.
As people ask about the cost, I would say: what is the cost of not knowing,
you know, what is the cost of having a society that is divided?
Ethnic studies is good for all students even students whose groups are not being studied.
White students do well when they take ethnic studies classes, just as African American students
do well when they take ethnic studies classes. There's no study out there
that's shown so far that ethnic studies actually does harm to students.
Ethnic studies builds empathy for the students, it builds a better community, a better society,
and improves academic grades. It's a winner for everybody all around.
-------------------------------------------
Eduardo Giannetti – The best books for learning psychology - Duration: 1:03. I'm working to overcome this dichotomy between fiction and non-fiction.
I find much more true knowledge
about deep human psychology
in good novels written by Dostoyevsky than in psychology handbooks.
I believe that it contains more knowledge,
especially when it comes to subjectivity.
I believe that creators working with literature
have been much more careful in their work,
in their attempt to make evident what happens
inside our minds, with all its drives, emotions and expectations,
and also to analyze each person's dreams.
And I find this knowledge to be extremely valuable.
I think that it has to be mobilized. And the literary genre doesn't matter that much.
The quality of workmanship, the final result, that's what matters.
-------------------------------------------
Speaker Of The House - Ceasefire (Lyrics) feat. HICARI - Duration: 3:35. Ceasefire
Ceasefire, Ceasefire
Isn't it enough
Ceasefire
Can we just call it a ceasefire, Yeah yeah, ceasefire
Isn't it enough that we hate each other, Other, So
Ceasefire
Ceasefire
Can we just call it a ceasefire, Yeah yeah, ceasefire
Isn't it enough that we hate each other, Other, So
Can we put what we've done behind us, in silence
Can we just call it a ceasefire, Ceasefire
Now, even mama's asking if it's true or false
You're the one behind it all
When you were the one who had got my back, but
It's easy for you to say shit like this
Why you so afraid of talking face to face, When
You had to have the final say
Was it not enough to break
Me into tiny pieces when you walked away
Ceasefire
Ceasefire
Ceasefire
Can we just call it a ceasefire, Yeah yeah, ceasefire
Isn't it enough that we hate each other, Other, So
Can we put what we've done behind us, in silence
Can we just call it a ceasefire, Ceasefire
Since I lost you I got nothing left to lose
Fighting's all we seem to do
When I'm not around to reject, dismiss it
It's easy for you to say shit like this
But something tells me you've been spreading
half the truth, well
They're telling me they heard the news
These people like to stare at me across the room
I'm not gonna lie to you
-------------------------------------------
My SECRET to Shutting Down Haters - Duration: 6:04.
-------------------------------------------
On momentum methods and acceleration in stochastic optimization - Duration: 51:33. >> Yeah. So, I'll be talking about momentum methods
and acceleration in stochastic optimization.
This is joint work with
Prateek Jain back in Microsoft Research in India,
Sham Kakade who's going to be at
the University of Washington and
Aaron Sidford from Stanford University.
Okay. Let me start by
giving a brief overview and context of our work.
As many of us might already be aware of,
optimization is a really crucial component
in large-scale machine learning these days.
Stochastic methods, such as Stochastic Gradient Descent,
are some of the workhorses
that power these optimization routines.
There are three key aspects with which
Stochastic Gradient Descent is used in practice
in these optimization routines
and which are crucial for its good performance.
So, the first one is minibatching where we compute
the Stochastic Gradient not over
a single example but rather over a batch of examples.
The second one is what is known as model averaging,
where you run independent copies
of Stochastic Gradient Descent on
multiple machines and try to somehow combine
the results to get a much
better model than each of them alone.
Finally, there is a technique from
deterministic optimization called
acceleration or momentum,
which is also used on top of Stochastic Gradient methods.
All of these three things seem to be important
for the good performance of
Stochastic Gradient Descent in practice.
However, from an understanding point
of view or a theory point of view,
we still don't have a very good understanding
of what are the effects of these methods
on Stochastic Gradient Descent and do
they really help and if they help,
how are they helping and
all these various aspects with which these are used.
Our work here is
a thorough investigation of all of these three aspects,
for the specific case of stochastic linear regression.
The reason that we consider
stochastic linear regression is because
the very special structure of the problem lets
us gain a very fine understanding
of each of these three aspects.
At the same time,
the results and intuitions
that we gained from this seem to have
some relevance to even more complicated problems,
such as training deep neural networks,
and I'll touch upon that topic
towards the end of my talk.
In this talk in particular,
I'll be talking only about the last aspect,
which is acceleration and
momentum algorithms on top
of Stochastic Gradient Descent.
So, this is basically the outline
or the high level subject of my talk.
Let me now start by giving
a brief introduction to deterministic optimization.
A lot of you might already know this but I
thought if I just spend a few minutes on this,
it might make the rest of the talk
easier to follow.
So, bear with me if you already know a lot of this stuff.
So, gradient descent is of course one of
the most fundamental optimization algorithms
and given a function f,
we start with a point w0 and at every iteration,
we move in the negative gradient direction.
As I mentioned, the problem that we'll be
considering in the stop is that of linear regression.
Here, we are given a matrix X and
a vector Y and we wish to find the vector w,
which minimizes the square of X transpose w minus Y.
This is the linear regression problem
and as you can imagine,
this is a very basic problem and
arises in several applications
and people have done a lot of
work in understanding how to solve this problem well.
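To make this concrete, here is a minimal sketch of plain gradient descent on this least-squares objective, writing it as f(w) = ||X^T w - y||^2 with X a d-by-n matrix whose columns are the data points. The step size and iteration count below are illustrative placeholders rather than anything taken from the talk.

```python
import numpy as np

def gradient_descent(X, y, steps=1000):
    """Minimize f(w) = ||X.T @ w - y||^2 by plain gradient descent.

    X: d x n matrix whose columns are data points; y: length-n targets.
    """
    d, n = X.shape
    H = X @ X.T                        # the matrix X X^T from the talk
    L = 2 * np.linalg.eigvalsh(H)[-1]  # smoothness constant of f
    eta = 1.0 / L                      # a standard safe step size
    w = np.zeros(d)                    # start at w0 = 0
    for _ in range(steps):
        grad = 2 * X @ (X.T @ w - y)   # gradient of ||X^T w - y||^2
        w = w - eta * grad             # move in the negative gradient direction
    return w
```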
In particular, the first question that comes to our mind
is what is the performance of
gradient descent when you
apply it to the linear regression problem?
In order to answer this question quantitatively,
we need to introduce this notion of condition number,
which is just the ratio of the largest to
smallest eigenvalues of the matrix X, X transpose.
With this notation, it turns out
that gradient descent finds an epsilon suboptimal point.
What I mean by that is that f of
w minus the optimum is at most epsilon.
So, gradient descent finds
an epsilon suboptimal point in about
condition number times log of
initial suboptimality over target
suboptimality iterations.
So, this is the number of
iterations that gradient descent
takes to find an epsilon suboptimal point.
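Written out (my reading of the rate stated verbally above), the iteration count is roughly

```latex
T_{\mathrm{GD}} \;\approx\; \kappa \,\log\!\frac{f(w_0)-f(w^*)}{\epsilon},
\qquad
\kappa \;=\; \frac{\lambda_{\max}(XX^{\top})}{\lambda_{\min}(XX^{\top})}.
```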
The next question that comes to mind
is whether it's possible to do any better
than this gradient descent rate
and there is hope because gradient descent,
even though it has seen gradients
of all the past iterates,
it only uses the current
gradient to make the next update.
So, the hope would be if there is
a more intelligent way to
utilize all the past gradients that we
have seen to make a better step then
maybe we can get a better rate and this intuition
turns out to be true and there are
several famous algorithms which
actually achieve this kind of improvement.
Some of the famous examples include
conjugate gradient and heavy ball method,
as well as Nesterov's celebrated
accelerated gradient method.
The last two methods in particular are
also known as momentum methods,
and we'll see why they're called momentum methods.
So, as a representative example,
let's look at what Nesterov's
accelerated gradient descent looks like.
While gradient descent had just one iterate,
Nesterov's accelerated gradient can
be thought of as having two different iterates.
We denote them with Wt and Vt and there are
two kinds of steps: gradient steps and momentum steps.
So, from Vt, we take a gradient step to
get Wt plus 1 and from Wt plus 1,
we take a momentum step to get Vt plus 1.
Then we again take a gradient step.
So, it alternates between
these gradient and momentum steps.
This is Nesterov's accelerated gradient algorithm
with appropriate parameters of course.
This is how it looks in the equation form.
The exact form is not important
but one thing that I want to point out here is
that the amount of time taken for a single iteration of
Nesterov's accelerated gradient is pretty much the
same as that taken by one iteration of gradient descent.
So, in terms of time per iteration,
it's exactly the same up to constants.
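As a schematic illustration of the alternating gradient and momentum steps just described, here is one standard two-iterate form of Nesterov's method for this quadratic. The parameter choices below (a 1/L step size and a (sqrt(kappa)-1)/(sqrt(kappa)+1) momentum coefficient) are textbook defaults, not necessarily the exact constants on the speaker's slide.

```python
import numpy as np

def nesterov_agd(X, y, steps=1000):
    """Schematic Nesterov accelerated gradient for f(w) = ||X.T @ w - y||^2.

    Assumes X @ X.T is full rank so the strong convexity constant is positive.
    """
    d, _ = X.shape
    eigs = np.linalg.eigvalsh(X @ X.T)
    L, mu = 2 * eigs[-1], 2 * eigs[0]                     # smoothness / strong convexity
    beta = (np.sqrt(L / mu) - 1) / (np.sqrt(L / mu) + 1)  # momentum coefficient
    w = np.zeros(d)
    v = np.zeros(d)
    for _ in range(steps):
        grad_v = 2 * X @ (X.T @ v - y)
        w_next = v - grad_v / L             # gradient step from v_t to w_{t+1}
        v = w_next + beta * (w_next - w)    # momentum step from w_{t+1} to v_{t+1}
        w = w_next
    return w
```

Each iteration costs one gradient evaluation plus a few vector operations, which is the "same cost as gradient descent up to constants" point made above.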
However, the convergence rate
of Nesterov's accelerated gradient turns out to
be square root of condition number times log
of initial over target suboptimality.
So, if you compare it to the rate of gradient descent,
we see that we get an improvement of
square root of condition number.
Condition number is defined as a max
over min so it's always greater than or equal to 1.
So, this is always
an improvement to our gradient descent.
This is actually not just a theoretical improvement.
In fact, even on moderately conditioned problems,
here we're looking at
a linear regression problem where
the condition number is about 100.
You see that Nesterov's accelerated gradient, which is in red,
is an order of magnitude
faster than gradient descent which is in the blue.
So, this actually really has
practical advantages in getting
this better convergence rate.
So, this is all I wanted to
say about deterministic optimization.
If we come to the kind of
optimization problems that we
encounter in machine learning,
we usually have a bunch of training data X1,
Y1 up to Xn, Yn.
Let's say we get all of this data
from some underlying distribution on Rd cross R. So,
d is the underlying dimension and
we get this training data.
We can use whatever we saw so far to minimize
the training loss, which is just 1 over
n times the sum over i of (Xi transpose w minus yi) squared.
So, the f hat of W that we have
here is the training loss and we
could use either gradient descent or
Nesterov's accelerated gradient
to solve the training loss.
But in machine learning,
we're not really interested in optimizing
the training loss per se,
what we are really interested in is
optimizing the test loss or test error,
which means that if you are given
a new data point from the underlying distribution,
we want to minimize X transpose w minus y whole square,
where X and Y is sampled
uniformly at random from
the same underlying distribution.
In order to optimize this function however,
we cannot directly use the gradient or
accelerated gradient method from before because
the gradients here can be written as
expectations and we don't really
have access to exact expectations.
All we have access to is
these samples that we have seen from the distribution.
This setting has also been well-studied
and goes by the name of stochastic approximation.
In a seminal paper, Robbins and
Monro introduced what is known
as Stochastic Gradient algorithm to
solve these kinds of problems.
The main idea is extremely simple,
which is that in any algorithm that you take,
gradient descent for instance,
wherever you are using gradient for the update,
you replace it with a Stochastic Gradient.
Then the Stochastic Gradient is
calculated on just using a single point.
In expectation, because the Stochastic Gradient
is equal to the gradient,
we're in expectation doing gradient descent and
we would hope that this also
has good convergence properties.
Moreover, if you look at the per-iteration cost
of Stochastic Gradient Descent, it's extremely small.
So, the stochastic gradient computation just
requires us to compute the term in red
for the linear regression problem,
which requires us to just take
one look at the data point and that's about it.
So, making each of these Stochastic Gradient updates is
extremely fast and because of this efficiency,
it's widely used in practice.
While I give you this example on top of
gradient descent and how to
convert it into Stochastic Gradient Descent,
you could apply the same framework
to any deterministic optimization algorithm.
If you take heavy ball, you
will get a stochastic heavy ball.
If you take Nesterov's accelerated gradient,
you will get stochastic Nesterov accelerated gradient.
The recipe is just simple.
You will have an updated equation,
just replace gradients to Stochastic Gradients
and you get these stochastic methods.
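Following that recipe, a minimal streaming version for least squares might look like the sketch below; `sample` is a hypothetical callback returning one fresh (x, y) pair, and the constant step size is an illustrative placeholder. Setting momentum to zero gives plain SGD, while a positive value gives a stochastic heavy-ball variant; a stochastic Nesterov variant is obtained the same way from the accelerated update equations.

```python
import numpy as np

def streaming_sgd(sample, d, steps=100_000, eta=0.01, momentum=0.0):
    """One-sample SGD (optionally with heavy-ball momentum) for least squares."""
    w = np.zeros(d)
    v = np.zeros(d)                    # velocity buffer, used only if momentum > 0
    for _ in range(steps):
        x, y = sample()                # one fresh data point from the distribution
        g = 2 * x * (x @ w - y)        # stochastic gradient of (x^T w - y)^2
        v = momentum * v - eta * g
        w = w + v                      # with momentum = 0 this is w - eta * g
    return w
```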
Okay. So, now let's again come back to
Stochastic Gradient Descent and
try to understand the convergence rate of
Stochastic Gradient Descent in
the stochastic approximation setting.
For illustration, if you just
consider the noiseless setting
where y is exactly equal to x transpose w star.
So, there is some underlying truth w star and
our observations y are
exactly equal to x transpose w star.
Then the convergence rate of
Stochastic Gradient Descent again turns out
to be very similar to that of
gradient descent, which is condition number
times log of initial over target suboptimality.
But the definition of condition number here is different
compared to what we had in the gradient descent case.
But with this new definition,
the convergence rate looks essentially the same.
If we consider the noisy case where y is equal to
x transpose w star plus zero-mean additive noise,
then there will be an extra term in
the convergence rate which depends
on the variance of the noise,
sigma squared, which we denote here by sigma squared.
So, let me try to summarize
all of the discussion so far in a table.
>> Theta is in the data?
>> What is theta?
>> Yes. It has some additional log factors
in whatever is inside the brackets.
So, summarizing all our discussion so far,
in the deterministic setting,
we saw that gradient descent has
the convergence rate of condition number times log
of initial over target suboptimality,
so it depends linearly on the condition number.
We saw that there are
these acceleration techniques which can
improve by a factor of square root condition number, right?
And in the stochastic case,
Stochastic Gradient Descent has
again this kind of convergence rate,
which is the sum of the noiseless part and then
one term that comes from the noise,
and the broad question that we ask here is,
whether accelerating Stochastic Gradient Descent
in the stochastic setting is possible,
just like what we were
able to do in a deterministic setting.
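As a rough reconstruction of the summary table the speaker refers to (up to constants and logarithmic factors, with 1/epsilon standing for initial over target suboptimality):

```latex
\begin{aligned}
\text{Gradient descent (deterministic):}\quad & \kappa \,\log\tfrac{1}{\epsilon} \\
\text{Accelerated gradient (deterministic):}\quad & \sqrt{\kappa}\,\log\tfrac{1}{\epsilon} \\
\text{SGD (noiseless):}\quad & \kappa \,\log\tfrac{1}{\epsilon} \\
\text{SGD (with noise):}\quad & \kappa \,\log\tfrac{1}{\epsilon} + \frac{\sigma^2 d}{\epsilon}
\end{aligned}
```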
I need to clarify the question a little bit
more because it's well known that
the second term that we have here which is sigma square D
over epsilon is actually statistically optimal.
So there is no way any algorithm
can improve upon the statistically optimal rate.
There are information-theoretic lower bounds;
that's the best you can do.
So when we talk about accelerating
Stochastic Gradient Descent what we mean is,
can we improve this first term where,
for instance, we have a linear dependence on kappa?
Can we perhaps improve it to
a square root kappa, for instance?
Okay, yeah.
>>But here I could just use the direct methods
to jump directly to the solutions,
you mean optimal in the class of gradient first order?
>>Yeah, yeah optimal in the class of first-order methods.
>> So, what about second-order methods?
>> Second-order methods- even within
first-order methods if you're not
interested in a streaming algorithm,
you could just take the entire co-variance matrix
and try to invert it again
using maybe a first-order method.
But we are looking at streaming
algorithms which are actually
what we used in practice. Yeah.
>> You could also use conjugate gradient [inaudible].
>> Again, you can use that on
the empirical loss for
the streaming problem conjugate
gradient methods I don't know,
they have not been studied for
the streaming kind of thing.
For the empirical loss we could do
either second-order methods or conjugate gradient,
any of the things, yeah, okay?
So, before I try to answer this question,
let me try to convince you why this question is worth
studying and the reason is
that in the deterministic case firstly,
we saw that acceleration can actually
give orders of magnitude improvement
over an accelerated methods and there is
reason to hope that the same thing
might be true even in the stochastic case.
In fact, even though we don't really know of
an accelerated method in the stochastic setting,
in practice people do use the stochastic heavy ball and
stochastic nesterov that I just
mentioned in training deep neural networks.
So much so that in fact if you look at any of
the standard deep learning packages
like PyTorch, or TensorFlow,
or CNTK, we'll see that
SGD actually means SGD with momentum.
So, there is this default parameter of momentum and
as it is usually run with
this additional momentum parameter.
Even though we don't really understand what
exactly it's doing and how it's helping,
if at all it's helping and so on, okay?
So, in the practical context
people are using these things but at
the same time we don't really
have a full understanding of
whether this even makes sense in the stochastic setting.
Finally, in terms of related work on
understanding acceleration in the stochastic setting,
there are some works by Ghadimi and Lan in
particular which try to
understand whether acceleration is
possible in the stochastic setting.
But they consider a different model compared to
what I introduce here as the machine learning model.
At a high level what it means is that,
they assume additive bounded noise,
rather than sampling one element at a time or
minibatches at a time, and these
are completely different models.
I'll be happy to go over it in
more detail if you're interested offline about this,
about the differences in the settings here.
So, let me now jump to an outline of our results
and I'll present it as the questions that we
asked and what we get as an answer.
So, the first question that we asked
is whether it's possible to improve
this linear dependence on the condition number
to a square root condition number,
for all problems in the stochastic setting.
It turns out that the answer to this is no.
There are explicit cases where
it's not possible information theoretically,
to do any better than the condition number.
So, the second question would then be
whether improvement is ever possible.
Let's say for some easier problems,
maybe improvement is possible and the answer to
this is actually subtle
and it turns out that it's perhaps possible,
but it has to depend on other problem parameters,
it cannot just depend on the condition number.
Then the third question that we asked is whether
existing algorithms like stochastic heavy ball
and stochastic Nesterov's accelerated gradient.
These are actually being used in practice,
do they achieve this improvement whenever it is
possible on problems where it is possible?
The surprising answer to this is that they do not
and in fact there are
problem instances where they are no better than SGD,
even though improvement might be possible.
Finally, the question we ask
is whether we can design an algorithm which
improves over Stochastic Gradient Descent
whenever it is possible?
And we do design such an algorithm which we call
accelerated Stochastic Gradient Descent, or ASGD for short.
So, let me try to go
a bit more in detail into each of these things,
are there any questions at this point?
So, let's first try to
answer the first question whether acceleration is always
possible in the stochastic setting
and we'll try to do this with examples and
the example that we consider will be
the noiseless case where Y is exactly equal to
X transpose W star for some vector W star.
The question we are asking is
whether we can improve the convergence rate of
Stochastic Gradient Descent from a linear dependence on
condition number to maybe
a square root on the condition number?
For this, let's consider a discrete distribution.
So, here we said Y is equal to X
transpose W star and W star is some fixed vector,
so all I need to specify is
the distribution of X, and here I'm just
specifying the distribution of X. So let's say
X is a two-dimensional vector:
(1, 0) with very high probability
0.9999 and (0, 1) with very low probability 0.0001.
In this case, an easy computation tells
us that the condition number is one
over the minimum probability which is ten to the four.
The question we are asking is if you can get
a convergence rate that looks
like square root condition number,
or equivalently, if you can start decreasing
the initial error using
about square root condition number samples,
or about 100 samples, right?
In this case, it turns out that
it's actually not possible and this is
not too difficult to see because if we
sample a bunch of samples from this distribution,
unless we see about condition number many samples,
we will not see the second direction,
which has probability ten to the minus four.
With fewer than ten to the four samples,
then we don't see anything in this zero one direction,
any X in the zero one direction.
So, there is no way we can
estimate the second coordinate of
W star without looking at the zero one vector.
So, with fewer than condition number of samples,
we cannot actually estimate anything about that second coordinate of W star.
So, acceleration is not possible in this setting.
So, this answers our first question in the negative,
that there are some distributions where
acceleration may just not be possible,
information-theoretically. Yeah.
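To make the counting argument concrete, here is a small simulation of the discrete example (my own sketch, not from the slides): with far fewer than kappa = 10^4 samples, the rare (0, 1) direction is typically never observed, so nothing about the corresponding coordinate of w star can be learned.

```python
import numpy as np

rng = np.random.default_rng(0)
p_rare = 1e-4               # probability of drawing the (0, 1) direction
kappa = 1 / p_rare          # condition number of this discrete distribution

for n in [100, 1_000, 10_000, 100_000]:
    trials = 200
    # fraction of size-n samples that contain the rare direction at least once
    hits = sum((rng.random(n) < p_rare).any() for _ in range(trials))
    print(f"n = {n:6d} (n/kappa = {n / kappa:5.2f}): "
          f"rare direction seen in {hits / trials:.0%} of trials")
```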
>> So, your second coordinate of W
star is one over kappa anyway.
>> Yeah. So, it depends on-
>> [inaudible] less than one over kappa?
>> Yeah.
>> You can always estimate it with one sample?
>> Yeah, yeah. So, it
depends on what is your noise level.
So, if you are above noise level,
then you don't care about that coordinate.
If you're below that noise level when you do want to get
that right, you care about it.
>> It's usually [inaudible] epsilon level on your desired accuracy.
>> Yeah. So, it does depend on epsilon.
However, when I'm talking about the rates here,
this rate- if I give you a certain rate,
either you have to write it in terms of,
till this error I get this rate,
till this error I get a different rate.
Or if you're trying to write a universal rate,
you do want to understand what is
the real behavior of
the algorithm, yeah that's all I'm saying.
>> Dependence on epsilon is not one
[inaudible] which is not of dependency here.
>> Can you, can I-
>> You do have dependency on epsilon in your formula?
>> Yeah.
>> [inaudible] epsilon.
>> Yeah, yeah.
If you're allowed to depend on epsilon as well,
then the dependency will be something
good till you want to get one over kappa accuracy
and something different after
you- if you want to go below
on all kappa. Yeah.
>> Did you say the second dimension
matters a lot less than the first dimension?
>> Yeah.
>> Because it's still rare to ever see that direction.
>> Yeah.
>> But why isn't observing the first dimension enough to
get very high or a very low [inaudible].
>> Yeah. So, before I was saying so,
if you only care about error
which does not carry you- okay,
so one thing you can think of is you
could be so far away from
your optimal in the second direction
that even with the low probability when you
take an expectation, it could actually matter.
>> Yes. If the loss is super large.
>> If the loss is super large.
But the point I'm trying to make here is that,
if you only care about a certain level of accuracy,
then you don't care about the second coordinate.
If you really care about the entire spectrum of rate,
then you do care about the second one as well
because eventually, you want to get everything right.
This example seems cooked up,
but the point of this example is to illustrate the issues
that might arise when you think of acceleration
when you go from
a deterministic setting to a stochastic setting.
So when I go to maybe empirical evaluations,
things may be much clearer.
But at this point, this example is more
to illustrate what are the things that
would just be completely different between
deterministic and stochastic settings. Yeah.
>> If for example the loss is
bounded, as in classification-
>>Yeah.
>> -then this will not be an issue.
>> So, this may not be an issue. I mean
you could also think of for
instance where one dimension is actually super large.
But there are a lot of other dimensions
which together matters a lot.
But then you're still restricted in
your step size because of the single larger direction,
where this might still be an issue, right.
Okay, so this basically
demonstrates that you cannot
expect this acceleration phenomenon
to always magically work out in
the stochastic case and
the second question then we would ask is
whether it's ever possible for
this acceleration phenomenon to still
help you in the stochastic setting right.
And for this we again take a two-dimensional example
where the vectors come from a Gaussian distribution
and the covariance matrix is diagonal
with 0.999990 and 0.0001
same values we had earlier and even in this case
the condition number turns out to be
about 10 to the fourth, Right.
However because this is a Gaussian distribution,
if you just observe two samples,
with probability one they are
going to be linearly independent.
Right? And once you have
linearly independent samples you can just
invert them to exactly find W star.
So no matter how large the condition number is,
just after two samples,
you are able to exactly find W star.
And in this setting, this suggests that
acceleration might be possible,
at least from an information theoretic point of view
or statistical point of view.
So, for the second question we see
that it might be possible but it
has to depend on something else
and not just the condition number.
Okay. So, what is
this quantity that actually
distinguishes the previous two cases
of the discrete and Gaussian distributions?
And the answer is basically the number of
samples required for the empirical covariance matrix,
which is 1 over n times the sum of x i x i
transpose, to be close to the actual covariance matrix H.
So as long as
our empirical covariance matrix is close to
the true covariance matrix we are okay and
the important quantity here
is what is the number of samples that
you need for this to be the case.
For scalars, this is very well
understood and it's just the
variance of the random variables
that you are considering and for
matrices there is a similar notion called matrix
variance, which in this context
we denote by kappa tilde and
call it Statistical condition number,
this is the quantity which determines how
quickly the empirical covariance matrix
converges to the true covariance matrix right.
And this quantity is what really
distinguishes the previous two cases.
So how does this actually relate
to the actual condition number?
So if we recall
the computational condition number or
just the condition number which is here on the right,
is just the, let's say
the maximum norm of x squared over the support.
We can relax this but for simplicity let's
say the maximum norm squared over the support of
the distribution divided by
the smallest eigenvalue of
the covariance matrix so
that's the definition of condition number.
Whereas the statistical condition number is here on
the left for an appropriate orthonormal basis
e this is basically
a weighted average of the mass on the direction e divided
by the expected value of e transpose x i whole square.
So the expressions may look a little
complicated but once you have these
it's actually not too difficult to see
that the statistical condition
number is always less than or
equal to condition number because on the left you have
some weighted average whereas on
the right you have a minimum in the denominator.
That's basically the intuition
and this can be formalized and acceleration
might be possible whenever
the statistical condition number
is much smaller than condition number.
Okay? Because statistical condition number
we're thinking of it as an
information theoretic lower bound.
So, if kappa tilde is pretty much the same as kappa.
Then there is no acceleration to be had.
Whereas if kappa tilde is actually
much smaller than kappa there may be
scope for accelerating the rate
of Stochastic Gradient Descent.
>> What's the sum of E.
>>Yeah so, this was
a little complicated so I tried to simplify it.
So E is basically
the eigen vectors of the covariance matrix,
of the covariance matrix H, population
covariance matrix which.
>> So there is at most D.
>>Yeah this is this is like
D Gaussian random variables
here whereas there could be arbitrarily large.
Okay so, If you compare what
happens for the discrete and Gaussian cases.
So for the discrete case it turns out that
both statistical and computational condition number
are about 10 to the four which
are essentially the same and so that's why
we see that acceleration is not possible in that setting.
Whereas in a Gaussian setting, with some constants,
so the statistical condition number is of the order of
about 100 whereas the actual condition number
is about 10 to the four.
So there is a significant difference between
the statistical and computational condition numbers
and this was the reason
why in the discrete case acceleration
is not possible, whereas in
the Gaussian case there is still
hope that acceleration might be possible.
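One way to see the gap operationally (a rough numerical sketch based on the "empirical covariance close to H" criterion above, not the paper's formal definition) is to check how fast (1/n) times the sum of x_i x_i transpose approaches H, measured after whitening by H, for the two example distributions: the Gaussian concentrates after a few hundred samples, while the discrete one needs on the order of kappa samples.

```python
import numpy as np

rng = np.random.default_rng(0)
H = np.diag([0.9999, 0.0001])           # true covariance in both examples

def sample_discrete(n):
    rare = rng.random(n) < 1e-4         # (0, 1) w.p. 1e-4, else (1, 0)
    X = np.zeros((n, 2))
    X[~rare, 0] = 1.0
    X[rare, 1] = 1.0
    return X

def sample_gaussian(n):
    return rng.normal(size=(n, 2)) * np.sqrt(np.diag(H))

def whitened_error(X):
    emp = X.T @ X / len(X)              # empirical covariance
    W = np.diag(1.0 / np.sqrt(np.diag(H)))
    return np.linalg.norm(W @ (emp - H) @ W, 2)

for name, sampler in [("discrete", sample_discrete), ("gaussian", sample_gaussian)]:
    for n in [100, 1_000, 10_000, 100_000]:
        print(name, n, round(whitened_error(sampler(n)), 3))
```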
Okay and if we plot this,
plot the performance of stochastic gradient descent for both of
these distributions so the left
is discrete and the right one is Gaussian.
The green curve corresponds to the error of
Stochastic Gradient Descent over
iterations, and we see
that the point where the error starts to
decay is pretty much the same for both of
them and it's about the condition number.
So after the condition number of samples it
really starts to decay geometrically.
Whereas the statistical condition number in
both these cases is very different
for the discrete distribution there is
not much difference between kappa tilde and kappa,
whereas for the Gaussian distribution there is a lot of
difference between the statistical condition number
which is the information theoretic lower bound to
the place where Stochastic gradient descent
actually starts doing very well.
Right so there is a large gap between the pink line and
where the green line starts to
do very well for the Gaussian distribution.
Okay. So this next question would then be whether
existing stochastic algorithms such as
stochastic heavy ball or
stochastic Nesterov accelerated gradient,
do they achieve any improvement
in settings like this Gaussian setting
that we saw before where there seems to
be scope for acceleration,
and the surprising answer to this is
that actually we construct
explicit distributions and fairly natural distributions.
Not cooked up distributions where there is a large gap
between statistical and actual condition number
but the rate of heavy ball,
the stochastic heavy ball method is no better
than that of Stochastic Gradient Descent.
So in the stochastic setting,
even though there is scope for
improvement on certain problems,
stochastic heavy ball does not achieve this improvement.
We can show this rigorously only for
the heavy ball method but
the same intuition seems to be true,
the same result seems to hold empirically even
for stochastic Nesterov accelerated gradient as well.
So what I mean by that is that if you look at,
so this is the lower bound,
the green
one is Stochastic Gradient Descent,
I would hope that an accelerated method can do much
better than Stochastic Gradient Descent but it turns out
that the stochastic heavy ball
which is in blue and stochastic Nesterov which is in black,
they basically are on top of each other
and they don't do much better
than Stochastic Gradient Descent.
So, these momentum techniques do not really help in
the stochastic setting here. Yeah?
>> So if you did this deterministically
we would expect to have the
dip along this purple line, or?
>>Yeah. So you can use any of the offline methods right,
then using these number of samples is
sufficient to start decaying the errors. Yeah.
>> Is this with or without?
>> This is without
noise, but we have
similar plots with noise, but then there
is an error floor which is like 1 over n.
Yeah, so this is without noise.
>>So the dip should happen at two samples?
>>Okay. Yeah, so I
haven't plotted the direct methods here.
>>Yeah.
>>Direct methods, these two things are decoupled, right?
So there is how many samples do you take to construct
your empirical loss function and how
many iterations of your direct method that you run.
So, there are like both of these are
decoupled for direct methods.
>> Just to check, the answer for the last question is,
if you did a batch method,
direct method or whatever, on two samples, will it work?
>>Yeah. On two samples for the Gaussian distribution
it should just work. Yeah, exactly.
But number of iterations it will
take might still be larger.
Because the empirical covariance matrix
might still be very ill conditioned.
>> [inaudible]
>>Then if you completely
if you use a matrix inversion then it will be this quick.
If you again use a first-order method to solve
empirical loss function that could
take some while because
the empirical function is not very well conditioned.
Yeah, that's all I meant. Okay? So the point
of this plot is that
the stochastic momentum methods
in the stochastic setting do not really
provide the improvement that we expect them to
provide based on
our intuitions from the deterministic setting.
Okay? So finally, so the question
is whether we can actually design
an algorithm which actually gives these kinds of
improvements in the stochastic setting and
whenever such improvements are possible.
Right? And the answer to
this actually turns out to be yes and we design
an accelerated Stochastic Gradient Descent method.
We get the following convergence rate.
So, the second term which is sigma squared
d over epsilon is the part that comes due to noise,
and the first part was what
we were looking to improve for
compared to Stochastic Gradient Descent
which has a linear dependence on kappa,
our method has a dependence on
square root of kappa, kappa tilde.
And we already saw that
this kappa tilde is always less than equal to kappa.
So, this result is
always better than the condition number.
Okay? And that could be much better
when there is a large gap between kappa tilde
and kappa.
So, what does this mean in terms of plot?
Note that the X-axis is in the log scale.
So, the green one is Stochastic Gradient Descent,
the blue and black which overlap are
the Stochastic Momentum Methods,
and the red curve is our algorithm.
At least in this case, we see that there's
about an order of magnitude
improvement in the performance
of our method compared to that of
Stochastic Gradient Descent or even
the Stochastic Momentum Methods.
We should also note
that we are still far away from
the lower bound which is about 100 here,
and we believe that while
information theoretically 100 samples might be sufficient
for computational methods or streaming
computational methods which are
based on these first-order information,
we believe that it may not be possible to
do better than whatever our result is,
but we don't have any proof of the statement.
This is just a conjecture at this point.
>> Is this still the 2D Gaussian example?
>> It's actually a 100-dimensional Gaussian example.
>> So coming back to your question
two samples not enough, right?
>> Yeah, in this case. So that's why
I put it at 100. Yeah.
>> [inaudible] is 100, right?
>> 100, yeah.
>> That's it.
>> This is the same graph as before so previously-
>> Yeah, all the graphs are the same thing. Yeah, yeah.
>> [inaudible] so in your methods,
you showed that the convergence rate is actually
a square root Kappa Tilde.
>> Yeah.
>> So that doesn't correspond to
the lower bounds that you would achieve.
>> Yeah. So there's still a gap
between the Kappa Tilde here.
Both theoretically and empirically,
we do see that there is this gap.
We don't know if it is possible
to achieve the lower bound
with a streaming algorithm like the one that we're using.
You can definitely achieve that by
using a batch kind of method,
but with the streaming method, it's not clear.
>> Sorry, my [inaudible].
What is the value of square root Kappa Kappa Tilde here?
Is it [inaudible]?
>> Thank you. About 1,000.
>> It's 1,000?
>> Yeah. So, let
me now introduce briefly the algorithm that we use.
Unfortunately, we ourselves don't have
too much intuition about what exactly it is,
but at a high-level, what I want to point out
is that in the Deterministic Setting,
you could write Nesterov's Accelerated Gradient
in several different ways.
The most popular one is
a Two-Iterate Version which has momentum interpretation,
and there are also
various other interpretations of
understanding what Nesterov's acceleration is doing.
There is another interpretation
which uses, say, 4 iterates and has the
interpretation of simultaneously optimizing upper bounds
and lower bounds for the function
that you are trying to minimize.
While all of these are
equivalent in the Deterministic Setting,
meaning that when you use exact gradients,
they actually turn out to be quite different if you
use Stochastic Gradients
instead of Deterministic Gradients.
The algorithm that we analyze here is
basically Nesterov's Accelerated Gradient
in the four iterate Version,
and we again basically replace
the gradient with Stochastic Gradients.
In our paper, we explain this more
as how this relates
to a weighted average
of all the past gradients and so on,
and I'll be happy to talk more about it
offline if you're more interested in
knowing what exactly the algorithm is doing.
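For readers who want something concrete to look at, here is a very rough skeleton of a multi-iterate, Nesterov-style update driven by one-sample stochastic gradients, in the spirit of the description above. The coupling constants alpha, beta, gamma, delta are placeholders: the paper chooses them as specific functions of kappa and kappa tilde, so this sketch should not be read as the authors' exact algorithm.

```python
import numpy as np

def accelerated_sgd_sketch(sample, d, steps, alpha, beta, gamma, delta):
    """Skeleton of a coupled-iterate stochastic accelerated method.

    sample() returns one fresh (x, y) pair; alpha, beta, gamma, delta are
    placeholder coupling constants, not the tuned values from the paper.
    """
    w = np.zeros(d)
    v = np.zeros(d)
    for _ in range(steps):
        u = alpha * w + (1 - alpha) * v              # couple the two running iterates
        x, y = sample()
        g = 2 * x * (x @ u - y)                      # stochastic gradient at u
        w = u - delta * g                            # short (gradient-descent-like) step
        v = beta * u + (1 - beta) * v - gamma * g    # long (momentum-like) step
    return w
```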
So for the next couple of slides,
let me try to give a high-level proof overview of
the result that we have here. Yeah?
>> [inaudible] computationally, is it comparable to-
>> Yeah, up to constant, it's the same.
Instead of two rows it'll be four rows.
Yeah. Okay. So, if
you recall our result- this is our result, right?
So there are two parts to the result,
one is the rate because of the noiseless part,
the other one is due to the noise term.
The proof also naturally decomposes into two parts,
one to bond the first term,
and the other one to bond the second term.
For the first term, we can just assume that we have
a noiseless setting and just analyze
what is the convergence rate in the noiseless setting.
For the second term, you assume that you
start at the right point and just understand
how this noise drives the algorithmic process.
So, these are the two different components of the proof.
The first part actually is fairly
straightforward once we figure
out the right potential function to use.
The main innovation that we had to do here was
that for the standard Nesterov's Accelerated Gradient,
there is one potential function that's actually used,
and then we realize that there is actually
a family of potential functions that you could use,
all of them work for the
Deterministic setting but only one
of them seems more suitable for the Stochastic setting.
So, if we shift the potential function
by the H inverse norm,
it turns out that this is
the right potential function and
analysis is actually fairly standard,
in that it follows the kind of analysis that
Deterministic Nesterov's Accelerated Gradient follows.
And then you get
this Geometric Decay Rate of one
minus one over square root Kappa Kappa Tilde,
and this is where the Geometric Decay of
square root Kappa Kappa Tilde comes from.
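To make the difference in decay rates concrete, here is a tiny Python calculation comparing how many iterations a geometric rate of the form (1 - 1/D)^t needs to reach a target accuracy, with D equal to kappa versus D equal to sqrt(kappa * kappa tilde). The kappa and kappa tilde values below are purely hypothetical, and treating 1 - 1/kappa as the unaccelerated baseline rate is an assumption made only for this illustration.

    import math

    def iters_to_reach(D, eps=1e-6):
        # Iterations needed for a geometric decay (1 - 1/D)^t to fall below eps.
        return math.ceil(math.log(eps) / math.log(1.0 - 1.0 / D))

    kappa = 1e4        # hypothetical condition number
    kappa_tilde = 1e2  # hypothetical statistical condition number
    print("unaccelerated, D = kappa:              ", iters_to_reach(kappa))
    print("accelerated,   D = sqrt(kappa*kappa~): ", iters_to_reach(math.sqrt(kappa * kappa_tilde)))

With these hypothetical numbers the accelerated rate needs roughly sqrt(kappa / kappa_tilde) = 10 times fewer iterations, which is the kind of gap the noiseless part of the bound is capturing.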
For the second term,
it turns out that such a simple analysis
actually does not work out,
in the sense that if
we try to use any potential function,
it seems to blow up our results by a factor
of dimension or a factor of condition number.
So, we had to do an extremely tight analysis of how
the Stochastic process evolves
when it's driven by the noise.
So, you can think of the algorithm as a process,
something that describes a process,
and noise as something which
comes in as an input at every time-step,
and you have to understand how
the algorithm behaves under this noise input.
Basically, if you think of two iterates of the algorithm,
which is say, WT and UT,
then we write Theta T to be
the covariance matrix of these iterates,
and then we can write a linear system
that describes the evolution
of these covariance matrices,
Theta T. If you do all of this,
you get this inverse formula where B is an operator
which takes positive semi-definite matrices
to positive semi-definite matrices.
And you want to understand what happens to
(I minus B) inverse applied to
this noise-noise-transpose matrix,
and this is still quite challenging
because B has singular values which are larger than one.
So, if we use any crude bounds,
we cannot even get a good bound on this one,
so we had to really
understand the eigenvalues of this operator B,
and then solve this inversion problem in one dimension.
And then combine different dimensions
together using bounds on
statistical condition number and condition number.
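The "operator on covariance matrices" viewpoint can be illustrated with a much simpler toy example in Python: for a plain linear recursion x_{t+1} = A x_t + noise with noise covariance Sigma, the covariance Theta_t evolves as Theta_{t+1} = A Theta_t A^T + Sigma, and its stationary value solves a discrete Lyapunov equation that can be written with a Kronecker-product inverse. The actual operator B in the talk is more involved (the stochastic gradients are coupled with the iterates, and B has singular values larger than one, which is exactly why crude norm bounds fail there); the sketch below only shows the general mechanics, with an arbitrary stable matrix A.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 3
    # An arbitrary stable matrix (spectral radius 0.9); unlike the operator in
    # the talk, it is easy to control because all its singular values are below one.
    A = 0.9 * np.linalg.qr(rng.standard_normal((d, d)))[0]
    Sigma = 0.01 * np.eye(d)   # covariance of the per-step noise

    # Iterate the covariance recursion Theta <- A Theta A^T + Sigma.
    Theta = np.zeros((d, d))
    for _ in range(2000):
        Theta = A @ Theta @ A.T + Sigma

    # Stationary covariance in closed form: vec(Theta) = (I - A (x) A)^{-1} vec(Sigma).
    vec_stationary = np.linalg.solve(np.eye(d * d) - np.kron(A, A), Sigma.reshape(-1))
    Theta_stationary = vec_stationary.reshape(d, d)

    print(np.max(np.abs(Theta - Theta_stationary)))   # ~0: both give the same fixed point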
So, this is all I wanted to
say about the proof of the result.
So if we recap whatever we have seen so far,
we saw that for the special case
of Stochastic Linear Regression,
we can completely and precisely understand
the behavior of Stochastic Gradient Descent,
Stochastic Momentum Methods,
and this new accelerated
Stochastic Gradient Descent Method.
We saw that while
conventional methods don't really
provide any improvement in this setting,
this new method seems to provide
significant improvement even in the Stochastic Setting.
So, the next question that we were trying
to tackle is whether this
has any relation to problems beyond
linear regression, or whether this
applies only to linear regression.
So, we had a bunch
of conjectures or results
here and we tried to evaluate all of
them in the context of training neural networks.
So, the first question that comes to mind when we go into
this new setting is that
people are actually using Stochastic Heavy Ball,
Stochastic Nesterov in practice,
and there also have been very influential papers
which argue that they actually
give improvements over Stochastic Gradient Descent.
And this is why people started
using these methods in the first place.
Whereas what we're saying here is that
these Momentum Methods don't really give
any benefit in the Stochastic setting.
So, both of these things are contradicting each
other, so what is
the reason for this apparent contradiction?
The reason here turns out to be Minibatching.
So, if you think of Minibatching,
it's actually a continuum
between a completely Stochastic setting
and a completely Deterministic setting, based on
the size of the Minibatch that you actually choose.
What we are saying here is that in the extreme,
where the Minibatch size is
one, the completely Stochastic case,
these momentum methods are not helping.
Whereas in the Deterministic case,
we know for a fact that
these momentum methods actually help.
When you are using Minibatching somewhere in the middle,
it's conceivable that you are going to get
benefits purely because you are moving
somewhat closer to Deterministic Gradient Descent.
The point of our algorithm here, however, is that ASGD
improves over SGD irrespective
of what the Minibatch size is,
and for whatever Minibatch size, if there
is any acceleration that's possible,
it gives that kind of acceleration.
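The minibatching continuum itself is easy to check numerically: the variance of a size-b minibatch gradient shrinks roughly like 1/b, so a large minibatch behaves more and more like the deterministic (full) gradient. The Python sketch below uses an arbitrary synthetic least-squares instance; it has nothing to do with the speaker's neural-network experiments and is only meant to show why a larger batch moves you toward the deterministic regime.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 10_000, 20
    X = rng.standard_normal((n, d))
    w_star = rng.standard_normal(d)
    y = X @ w_star + 0.5 * rng.standard_normal(n)
    w = np.zeros(d)                         # evaluate gradients at an arbitrary point

    full_grad = X.T @ (X @ w - y) / n       # deterministic (full-batch) gradient

    for b in [1, 8, 128, 1024]:
        devs = []
        for _ in range(500):
            idx = rng.integers(0, n, size=b)
            g = X[idx].T @ (X[idx] @ w - y[idx]) / b   # minibatch gradient estimate
            devs.append(np.sum((g - full_grad) ** 2))
        print(f"batch size {b:5d}: mean squared deviation from full gradient {np.mean(devs):.4f}")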
So in order to test this hypothesis,
we first trained a deep auto-encoder for
MNIST with the smallest batch size of one,
so Stochastic Gradient on single examples.
It turns out that the performance of Stochastic Gradient,
Heavy Ball, and Nesterov are essentially similar.
There is really nothing to distinguish
between these methods.
Whereas if we run
our Accelerated Stochastic Gradient Method,
it runs reasonably faster,
at least in the initial part,
compared to that of
these other algorithms even
with a small Minibatch size of one.
Going next to a classification task
on CIFAR-10 using a ResNet,
we again used a small Minibatch size of
eight, which is much smaller than what's used in practice.
If you use this Minibatch size of eight,
we again see that there is not
much to distinguish the performance
of Stochastic Gradient from
the Stochastic Momentum Methods.
So all of them perform reasonably similarly.
Whereas if you compare our method with,
for instance, the Nesterov method here,
at least in the beginning phases it converges much
faster compared to the Nesterov method.
We only really think about
the initial phases because there
is this additional
Sigma squared term which I'll not talk about.
So eventually, you will reach the noise level
and then there'll be no acceleration in that part.
So, the acceleration that you can hope for is only in
the initial phases where you can
hope to get faster convergence.
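The two-phase behavior being described (a fast initial decrease followed by a plateau governed by the noise term, where no method can accelerate) can be seen even for plain SGD on a noisy least-squares stream. The sketch below is purely illustrative, with arbitrary dimensions, step size, and noise level, and makes no claim about the constants in the talk's bound.

    import numpy as np

    rng = np.random.default_rng(1)
    d, eta, noise_std = 10, 0.01, 0.1
    w_star = rng.standard_normal(d)
    w = np.zeros(d)
    for t in range(1, 20_001):
        x = rng.standard_normal(d)
        y = x @ w_star + noise_std * rng.standard_normal()
        w -= eta * (w @ x - y) * x          # single-sample SGD step on 0.5 * (<w, x> - y)^2
        if t in (100, 1_000, 10_000, 20_000):
            # Error falls quickly at first, then flattens at a noise-dependent level.
            print(t, np.linalg.norm(w - w_star) ** 2)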
>> Are you using learning rate decay?
>> Yeah. So, that's why the drops are correlated,
but the learning rates themselves have been
searched using hyperparameter search. Yeah.
>> How about the other processes?
>> So, in the previous thing, it was
just a constant learning rate, so there was no decay.
Here, the things that we search for are the learning rate,
when to decay, and how much to decay.
All of this was based on
a validation set and then we use this,
and these are zero-one test errors.
Yeah?
>> You just answered my question.
>> Okay. Yeah?
>> What is the error when it converges, after convergence?
>> Yeah, so here you can see it's about 90-something percent accuracy,
so say eight percent, about
eight percent is the error that you get here.
It's not maybe fully state of the art,
but then we are also not using
state of the art network to do this.
>> There is no main difference between both at the end?
>> Yeah, so the final
error there's not that much of a difference.
The final error seems similar.
>> So, you're saying, if you only have time for 10 classes,
you should use ASGD?
>> Yeah. So, I mean that's actually my next thing.
So, this was for a small Minibatch size of eight, right?
So, if you use something like 128,
which is more reasonable to use in practice,
we again see that ASGD reaches
pretty much the same accuracy,
but converges faster compared to the Nesterov method.
Of course, I mean you're again seeing
these drops at similar places because that's
exactly where we decay the learning rate,
and, for instance,
if you only care about getting
say to 90 percent accuracy,
you can get there using ASGD about
1.5 times faster as compared to Nesterov's method.
This could potentially be useful if
you are doing hyperparameter search and you don't
care about getting the optimal error
for every hyperparameter,
but only want to figure out what is
the rough ballpark the error is going to be. Okay?
>> There's a jump there, right?
It can't really predict how [inaudible] it's going to be after
a hundred epochs and the process initially [inaudible]?
>> Yeah, I mean if you only want to
see whether something is
extremely bad or it's reasonable,
this could help you, that's all I'm saying.
Yeah. So, okay this
brings me to the conclusion of my talk,
so to recap what we have seen in the deterministic case,
we saw that acceleration improves
the convergence speed of
gradient descent by a factor
of square root condition number,
and we asked the same question
in the stochastic approximation setting,
where we saw that
stochastic momentum methods, which are
the conventional and classical methods
that are being used,
do not achieve any improvement
over Stochastic Gradient Descent,
whereas the algorithm that
we proposed, which is accelerated
Stochastic Gradient Descent,
can achieve acceleration in
the stochastic setting whenever it's possible,
and while our theoretical results have been
established only for
the stochastic linear regression problem,
we also empirically verified
these observations in the context
of training deep neural networks,
and we also released the code for this algorithm in
PyTorch, and if any of you play with neural networks,
we would encourage you to try out our algorithm,
and let us know what you observe.
So, just as a high-level point
about optimization in neural networks and so on,
I would like to point out that while
optimization methods are heavily
used when training neural networks,
a lot of the methods that are actually
used are not well understood
even in very benign settings such as convex optimization.
I spoke about momentum methods;
stochastic momentum methods, even for
the special case of linear regression,
were not really well understood,
and algorithms such as Adam,
RMSProp, there's been
some recent work on Adam, for instance.
So, there are a bunch of
these methods which are widely used in practice,
but even in very benign settings,
the performance is not really well understood.
It's important to do this because, as
we saw, just because people are using something,
it may not be the best thing to do.
As practitioners, we may not
have all the time in the world to try out
all possible combinations to figure out
what is the best thing to do, right?
In this context, stochastic
approximation from a theory point of
view provides a good framework
to understand these algorithms,
to see what makes sense,
and what doesn't make sense.
While classical stochastic approximation results
focus on asymptotic convergence rates
and so on, we would be really interested
here in obtaining strong non-asymptotic results,
which give us finite-time
convergence guarantees and so on.
So, that's all I had to say. Thank you for being here.
>> Any questions?
>> Yeah?
>> [inaudible] graphs, what was the y-axis again?
>> Error.
>> What error?
>> Test error on the underlying distribution.
>> So, [inaudible] f of w zero minus f?
>> F star.
>> F star.
>> F minus F star.
>> [inaudible] F minus F star. On some
of the draws from the same distribution.
>> In this case, we can exactly compute
the function value because you
know that it's a Gaussian distribution.
So, you know the covariance matrix, right?
So, it only depends on the covariance matrix. Yeah?
>> So, your new algorithm, the accelerated [inaudible],
if you derive it from the deterministic one,
doesn't it have a correspondence
to some other known algorithms?
How did you come up with that?
>> Yeah, so as I said,
Nesterov's accelerated gradient can be written in
multiple ways in the deterministic world.
The most popular one is
the one that has this momentum interpretation,
which is what people use.
But there are various other interpretations and
various other ways you could write the same algorithm in
the deterministic world, and
our algorithm is a stochastic version of one of them.
Yeah. So, in stochastic world,
they're completely different even though
in the deterministic world, they are exactly the same.
>> So, your bound [inaudible] square root of [inaudible],
was that an upper bound?
>> That's an upper bound, yeah.
>> In the plot, the error wasn't going down around [inaudible]
>> So, that may be some other constraints.
>> So, you would expect it
to go down [inaudible].
>> [inaudible].
>> Yeah, so it hasn't hit the bottom,
so if you consider like 10 to the four as
the baseline for these methods.
So, right after 10 to the four, they curve down.
>> Is there a more quantitative way to [inaudible].
>> Yeah, so there was a more quantitative way
which is that ideally
you want to run this for different values of
condition number and see,
and then plot the line,
and see the slope of that line,
which we also have; it's
a little more difficult to explain in a talk.
>> I see.
>> Yeah.
>> Well, if there are no more questions,
let's thank Amit again.
-------------------------------------------
Here Is Why Trump Is To Blame For Terrorists Potentially Getting 3-D Plastic Guns - Duration: 1:30.>> Soon, no conditions for guns,
apparently.
This morning, the plans for ghost
guns are up online.
These are firearm components for
rifles and guns that can be
created by anyone at home who
has a 3-d printer.
There.
>> The president weighed in
saying I'm looking into 3-D guns
being sold to the public.
Already spoke to the NRA.
It just doesn't seem to make much
sense.
It was the Trump State
Department that dropped the
lawsuit to stop the blueprints.
The Obama administration started
it, saying the plans could be
downloaded by terrorists, and
beyond dropping the lawsuit, the
Trump administration is looking
at changing the very rule the
designer was originally sued
under.
And to the second part of the
president's tweet, there's no
word why he would check in with
the gun lobby or why the
conversation would have taken
place.
Let's slow it down.
The NRA is a special interest
group.
He didn't say he spoke to the
department of justice or the
state department.
He's not speaking to any other
gun control groups, but the
president took to Twitter to say
I have reached out to the NRA.
So those of you who have
chanted over and over, drain
the swamp, walk us through --
>> Why you would go to the
industry group.
>> Why would you contact the
largest lobbying group ever, the
NRA, on what to do about the
ghost gun.
-------------------------------------------
Kershaw Natrix Knife Review- New for 2018 Carbon Fiber and G10. Based on ZT 0770 - Duration: 10:21.Last week you may remember a video where I strapped on the old proverbial water skis
and prepared for a well supervised stunt that involves a large ocean dwelling creature made
entirely of teeth.
And if you don't get that joke, then congratulations you never watched Happy Days.
And I'm not even being sarcastic about that.
But we're back to knives and not the more expensive brand name melamine foam which I
should have bought way cheaper off Amazon.
And this one specifically is the updated Kershaw Natrix which you have been seeing show up
on all the internet knife reviews table tops for the past two months- including Austria's
finest gear reviewing Youtuber A Little Older, whose video about this Natrix I'll link
at the end.
But before we get into the real knife demonstration video art let's look at the dimensions of
this update to a knife I think they released last year.
A statement I did not research well enough.
Like the overall length and weight.
I don't research anything.
I mean I researched new computers today.
I can't afford any of those.
Blade size and cutting edge.
What do you research or do that is a waste of your time.
Before you answer remember if you don't have anything nice to say.
Handle size and grip area.
Then you might be a student loan servicer!
Spine thickness and handle thickness.
You know I've been out of college for about 16 years now.
Tallness and flipper tabs.
I think in 2024 I'll have those paid off.
I heard Youtube was lucrative so that's why I'm here.
The good news is though that the Kershaw Natrix with carbon fiber overlays is a good knife
for under $50- so if you also wasted money on an industry specialized for profit college
you too can own this knife.
Of course if you made the smart choice and went into say computer science at a traditional
college you might be able to afford the knife this is based off of- the Zero Tolerance 0770,
which is based off the 0777- which won an award of some kind.
According to the website.
PBR won an award.
Let's look at the blade shall we.
It features a drop point, that resembles a sheeps foot in a way depending on how you've
been drinking.
"that sheep has real nice feet" ok not that much.
It's covered in a titanium carbo-nitride coating over a blade made from 8Cr13MoV, sometimes known
as D2 if you buy a cheap knife directly from China.
It has a decent sized cutting edge, and some fine spine jimping on the top- not quite like
a spyderco but close.
Deployment of the blade is handled by a flipper tab that is not assisted, but has a strong
detent, like the Kershaw Fraxion I reviewed last year.
These knives are kind of similar in spirit- the Fraxion being smaller but having a light
mostly G10 frame.
This one can be deployed consistently with minimal effort.
Most is the initial press, and pop it rockets out.
It's actually almost harder to keep it from deploying fully.
Lockup is handled by a sub-frame lock... which is a patented Kershaw feature, but looking
at it in idiot terms it's a metal piece attached to the G10 with screws... so the knife doesn't
have to have a full liner- it reduces carry weight and keeps it a nice size blade and
handle.
One thing I noticed after a few hundred deployments is that the blade centering was off... the
pivot had loosened a bit, and after tightening it re-centered itself.
The pivot is easy enough to tighten- and apparently has no Loctite, which is a bonus.
Detent is strong so it's not possible to fling it open downward gravity style.
Handle is as mentioned made primarily from G10 with no liners.
It has a back spacer so it's only partially open backed.
It has thin carbon fiber overlays to match the hood of your 4-door Accord with the car
seat in the back.
For a light weight every day knife I don't see any reason to make a handle out of steel
or even titanium.
Pocket clip.
Deep carry short, tip up in right or left pocket because it's swappable.
However, no tip-down carry is possible- which relatively few people should complain
about.
Note the word relatively.
And I just got this question recently from a new subscriber... my preferred carry when
I'm wearing pants is tip up, blade backward, in my right pocket; I am right handed.
Pocket clip has the right amount of tension; I'd like the tip to be a little less aggressive
of an angle but that's a minor nitpick.
Comparisons.
We'll keep this short this video doesn't need to be 10 minutes.
I dunno maybe it will be.
First the Natrix.
Blue and black, I think the knife is nice looking, and the handle is kinda comfortable.
I like a grip area of about 3.5 inches so I can move my hand around a bit.
A little smaller isn't bad I guess, not everyone needs large knives.
I own a few Kershaws and this one would be most likely to show up in a pocket rotation.
Now the Fraxion- this one is also very nice.
A coworker asked me once what a good small knife was, and I recommended this.
It's a fast deployer and you barely know it's in your pocket.
Not everyone needs handles and blades over 3 inches.
How about the Cryo 2, Hinderer designed?
A little too heavy for me, and not quite big enough. I don't need an assisted knife when
a non-assisted one like the Natrix is nearly as good.
My rule of thumb is, if the blade and handle are not the ideal size for me, and it's over
4 ounces I won't carry it.
Let's look at the Para Military 2... a little bit larger handle and blade, the ideal every
day carry size for me, but a lot more expensive, probably by $100.
Sometimes I have a hard time recommending a good light modern snappy deploying knife
that's affordable that's not a spyderco.
The Natrix fits that bill... it's sort of comparable to the Spyderco Tenacious in price
point and similar steel, but a little lighter and slightly smaller.
And I think nicer looking.
The Tenacious is kind of plain in black G10 personally- although nicer after drinking.
Wrap it up.
The Natrix is a fine every day carry blade if you're on a budget.
Sure, knife snobs don't like 8Cr13MoV steel, but for the person who doesn't mind sharpening
a bit more often, it's a good budget choice.
I own many knives in a lot of different steels, but that's never been the main reason I choose
to carry a specific knife.
My last two larger knife purchases were both Spydercos and the looks, and handle materials
were my first considerations when spending money I didn't have.
Many of the knife designs I like, from the looks, to the ergonomics tend to have good
steel already.
And as a note of caution on the batoning of light weight knife designs.
They are not designed for this.
And it voids your warranty.
Plus you might injure yourself or a nearby drunk loved one.
The blade stop pin popped out because of the flex of the G10 and the repeated hard whacking.
Luckily my knife was easily fixed by partially disassembling the knife, and praying to Lynn
Thompson that I could find the stop pin to finish my review.
The Lord himself smiled down- he was wearing that black tie and dress shirt, surrounded
by 1000 flaming dismembered pig heads.. and I found it.
This of course is not a reflection on the construction of the knife- but a reflection
of my poor character and terrible sense of humor.
The testament is, I was able to fix the knife and return its operation to normal boring
every day carry things.
Like cutting.
So if you like this sort of review, subscribe to my channel, give the video a thumbs up,
leave a comment.
I started my Patreon recently- you're like yeah I know I saw you groveling on Instagram-
I'll link it in the video description.
Signing up for a donation there helps me afford stuff to review for the channel when companies
don't answer my emails.
I feel like there's a chance for self reflection here to take some personal responsibility.
You know that thing you think everyone else should take.
Yes the problem is other people don't take personal responsibility.
That's it.
That's why I started a Patreon.
Although Kershaw did send me this knife to review- so that's cool.
Patreon also helps pay to maintain my camera and video making equipment.
If you're watching this video that means I was able to successfully upgrade my failing
hard drive in my 2011 iMac I bought in 2012 as a refurbished unit.
If you don't see this video in the last week of July- well some stuff went down and it
took longer than expected.
Also, if you like occasional giveaways and photos of knives, follow me on Instagram.
However, if you don't want to join Instagram for a crappy giveaway, it's not my problem.
Thanks for watching!
-------------------------------------------
Cas Client : Dysfonctionnement de l'afficheur du lave-vaisselle - Duration: 1:36. For more infomation >> Cas Client : Dysfonctionnement de l'afficheur du lave-vaisselle - Duration: 1:36.-------------------------------------------
"je récupère mon ex" quelle est la méthode? - Duration: 12:27. For more infomation >> "je récupère mon ex" quelle est la méthode? - Duration: 12:27.-------------------------------------------
Le cancer n'est qu'un champignon qui peut être traité simplement avec du bicarbonate de soude - Duration: 11:10. For more infomation >> Le cancer n'est qu'un champignon qui peut être traité simplement avec du bicarbonate de soude - Duration: 11:10.-------------------------------------------
IFL East Highlights 2018 (long version) - Duration: 4:24.thank you to all the people who contributed to make this superb day
a reality
primary schools, secondary schools and colleges have sent their teachers from all around the
country to come to the International Festival of Learning and it's been such
a diverse mix of people it's just so great getting so many brilliant minds
together at the same place all talking about education it really raises the bar
on bringing like-minded people together it kind of puts the East of England
really on the map in terms of being a good forum to talk about education and
bring best practice together we've heard Amanda Spielman, Geoff Barton, it's just
been sort of a high profile level of people and thought leaders and it's just
been absolutely fascinating and informative
I've been really impressed with the number of breakout rooms and all the
different subjects being talked about so the last session I was in was about STEM
but we had a music teacher who was getting involved with the discussion I
think that really says a lot about the event you've attracted a lot of
different people here I've come here today and as well as very targeted very
focused sort of learning and knowledge acquisition from going into the seminars
there's been a great opportunity and a great amount of networking going on it's
a festival in the whole sense of the word so we had bunting out, we had
mountain climbing, we had archery, we had people eating food on the grass, it
was brilliant it was just like so relaxed. I think in the eastern region an
event like this can have a big impact for teachers who are dealing with
difficult issues around STEM; they can come together and meet others who are
dealing with the same issues, and talk to employers, particularly those who are in
industries that are employing in the region
it is giving the teachers and education leaders confidence, confidence in what
they do, and raising the aspirations of the community on what can be changed
it's about health and well-being and having positive health and well-being
that enables pupils to learn. A school's biggest resource is its staff, so it's really
important they look after their staff's well-being and make sure that their
stress levels are kept to the minimum really
I'm a great believer in education that stimulates children just to be lifelong
learners for the future not just passing some temporary assessment framework that
we have in place in our schools at the moment but giving them those core skills
that make them curious people who want to learn for their whole life. We're
ScottishPower Renewables, developing wind farms off the coast, and as
an organisation we're committed to ensuring that the future generations of
scientists and engineers are ready to work on those wind farms. I want to fix
the one-size-fits-all delivery of education using innovative technology I
want to reduce teacher workload and allow our teachers to focus on what they came to do
which is teach. And the number one problem that I want to solve in terms of
education is how we see the profession. I no longer want to see teachers saying I'm
just a math teacher, I'm just an English teacher; that person would have had a
profound impact on 10,000 pupils, which is absolutely incredible. And so I have lots
of thoughts on education or what we can do to improve, but I really think that we
need to focus on the front line, on teachers, on their delivery, on their
well-being, on how they're performing, and on helping them, and also
obviously directly on the students. It's been really great to kind of
make new contacts in business that, you know, two or three years down the line I
could maybe go back to and say, you know, I had a conversation with you at the
International Festival and here I am, have you got anything you can offer me.
For me it's been about making contacts and networking.
-------------------------------------------
Warbringers: Sylvanas (Türkçe Altyazılı) - Duration: 4:00. For more infomation >> Warbringers: Sylvanas (Türkçe Altyazılı) - Duration: 4:00.-------------------------------------------
Max Verstappen defends his F-BOMB rant live on Sky Sports: 'Shame they bleeped it' - Duration: 3:03.Max Verstappen defends his F-BOMB rant live on Sky Sports: 'Shame they bleeped it' Verstappen lasted just six laps before he began to lose power and steering in his Red Bull car. The Dutchman was told to pull over by his team and he replied over the team radio with a series of expletives. "Mate, really? Can I not just keep going? I don't care if this f***ing engine blows up," he fumed. "What a f***ing joke, all the f***ing time.
Honestly. Argh." Verstappen's rage was censored before reaching the viewers but he wishes they would have been able to hear the full extent of his anger. "It is just, at the moment, difficult to accept," Verstappen added after the race. "I was very angry on the radio, I think there was a lot of bleeping out there, which was a shame that they bleeped it away because it would have been better if they would have allowed it but that is how it is." Red Bull have been seriously let down by engine providers Renault this year and they have already agreed to switch to Honda from 2019.
And he is delighted that at least there will be some light at the end of the very frustrating tunnel - although he will likely have to suffer an engine penalty first. "Yeah I think from both sides, both Daniel [Ricciardo] and me. It is honestly, not at all, how it should be," he said. "You pay millions as a team for, you hope, a decent engine but it keeps breaking down. Not only that but we are also the slowest out there.
"I felt good with the car and had a strong start but the race was then over within six laps. "It is really frustrating after putting all the effort in and being in a promising position, but then having to stop due to reliability. "As I was happy with the car I think we could have had a good battle with the front group, it's a shame to have missed out on that and some valuable points.
"It's such a shame for not just myself and the team but also the fans that travel all the way here supporting me. "It's not fun to watch me complete a few laps and then retire. I'm not sure if this will mean engine penalties for Spa, we will look into it as a team and discuss the best way to come back strong after the summer break. "I don't really feel like going away on holiday now as this isn't the way I wanted to finish the first part of the season. "I would like to get back in the car to race again and finish on a strong result, unfortunately I can't.".
-------------------------------------------
4'7 M3T3R D0WN*F;u;1;1^M;0;v;i;3*2o!7" - Duration: 1:22:27.
-------------------------------------------
MG F 1.8I 120pk, 127.dkm! Lmv, windscherm, Elek pakket, Radio/cd! - Duration: 1:11. For more infomation >> MG F 1.8I 120pk, 127.dkm! Lmv, windscherm, Elek pakket, Radio/cd! - Duration: 1:11.-------------------------------------------
Montana Made: Holter Heart Monitor - Duration: 2:35. For more infomation >> Montana Made: Holter Heart Monitor - Duration: 2:35.-------------------------------------------
How to care for Keyboards - Duration: 1:35.(piano music)
- Hi, this is Tanya here.
I'm so glad you've taken the opportunity to borrow
one of our portable keyboards from the library,
but I just want to share some tips
on how you can take care of it before you start.
If the keyboard didn't come with a cover,
make sure you put a light cloth over it
when it's not in use because this will prevent
any dust layers from forming on its surface.
Typically your keyboard will be pretty clean,
but if you need to wipe it down,
use a soft cloth not a paper towel.
Slightly dampen it with a mild mixture
of one to two drops of dishwashing soap and warm water.
Don't use any cleaning chemicals or solutions.
Wipe each key individually towards you starting
with the white keys and then the black ones,
then use a second clean cloth to wipe them dry.
One of the biggest and most common mistakes
people make with keyboards is spilling things
all over them.
So make sure you don't leave your drinks or your food
where it could accidentally spill onto the keys
and damage the electronics.
And needless to say, make sure your hands are clean
before you play.
Lastly, avoid exposing the keyboard
to extreme temperatures and if the keyboard is
on a keyboard stand, make sure it's properly assembled.
That's it.
You can find out more about proper care practices
and playing tips by going online
or contacting a music instructor.
Take care.
(piano music)
-------------------------------------------
Oven Roasted Broccoli and Cauliflower || Keto AND Kid Friendly - Duration: 1:10.Grilled Vegetables
Broccoli and Cauliflower
Cut into florets
Olive oil
Salt
Pepper
Garlic powder
Toss lightly
400 °F for 20 minutes
-------------------------------------------
Curious Quail - Impostor Syndrome - Duration: 3:58.[Tasty 5/4 drums and Gameboys bring us in]
[Gameboy bleeps and bloops dancing with guitar notes]
🎶The day was off to a good start🎶
🎶A lead, a sale, a raise🎶
🎶but brain fills with intrusive thoughts🎶
🎶And all good things erased🎶
🎶Someone compliments your work🎶
🎶It makes you hesitate🎶
🎶You know somehow that shoe will drop🎶
🎶And kill you while you wait🎶
🎶But Hold on🎶
🎶This isn't you talkin'🎶
🎶For so long🎶
🎶You were raised to believe that your brain can't be wrong🎶
🎶but it's so wrong🎶
🎶You gotta stop listenin'🎶
🎶And find your worth🎶
[Ascending series of noises on Gameboy and Guitar]
🎶Every win comes with a shock🎶
🎶Time and time again🎶
🎶Ticking borrowed time till you're🎶
🎶abandoned by your friends🎶
🎶Cuz someday they will find you out🎶
🎶Someday they'll see you🎶
🎶For this bed of trash you are🎶
🎶Who cannot follow through🎶
🎶You cannot follow through🎶
🎶But hold on🎶
🎶This isn't you talkin'🎶
🎶For so long🎶
🎶You were raised to believe that your brain can't be wrong🎶
🎶but it's so wrong🎶
🎶You gotta stop listenin'🎶
🎶And find your worth🎶
[The guitar is really just ornamental at this point]
[Chillout Gameboy breakdown time]
[Joey turned off his snare all smooth af]
[Wurlitzer; better late than never]
🎶Block it out🎶
[Let's accent that Wurlitzer part with some delay]
🎶You need to somehow🎶
🎶And when you're down🎶
🎶Know that there's only you and you're good enough🎶
🎶And you've earned all these things that you're good at🎶
🎶There's only you and you're good enough🎶
🎶You'll be the best you, you're good enough🎶
🎶Wait and see🎶
[Ahh yes, a rest]
🎶Wait and see🎶
[HIT IT JOEY]
[This is Mike's second guitar solo of the year, it's getting out of hand]
[Unlike in previous songs, we actually DID need this many guitars]
[Closing it out with some final guita...]
[Just kidding, Gamebo...]
[DAMNIT JOEY]
[Thank you for watching!!]