Re: help with DB slowness

baselzind · ‎07-14-2024

The application team says this statistic proves that the application slowness is network related, but I need help: can someone make sense of it and confirm that? As far as I can see, the average wait time is 3 ms, which isn’t slow.

marce1000 · ‎07-14-2024

- FYI : How to Ask The Community for Help
- You have not even mentioned which platform/model/software version this is occurring on.

M.

-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
When the mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

baselzind · ‎07-15-2024

Most of the DBs are on two C9410R switches connected in StackWise.

MHM Cisco World · ‎07-15-2024

It's hard to know what the issue is here, but you mention StackWise Virtual, so the first thing I would check is whether the traffic passes over the SVL or not.

Are the client and server connected to the 9400 via a port-channel (PO)? Is the PO hash the same?

Is one of the POs having an issue on one member port?

MHM

Joseph W. Doherty · ‎07-14-2024

Ah, but it's an average wait time. And even 3 ms can be slow if you're doing lots and lots of them.

It's possible the network is "slow", but in my experience, often the underlying problem is that the developers are going to and from the "well" much more often than they really need to.

Laugh, many years ago, I interviewed for a DBA position at the Federal Reserve in DC. One interview question was, what can you do to improve SQL performance. I answered, teach your SQL developers how to optimize SQL performance.

Anyway, as an example, you might want to read this BLOG posting. Toward the end, the result of "tuning" how SQL was being used was: "This was significant performance improvement. From 1 hour 22 mins to 2 mins 55 secs !"

Understand, cannot say what the root of your problem is, and your network may indeed be the bottleneck, or a contributor to slow performance, but even when the network is the problem, unless the network is really, really (really) bad, improving it, even to the best possible, doesn't often make a huge difference.

Oh, BTW, network latency can be a real performance reducer, but in some cases, the "best" is still bad. For example, clients accessing servers on the other side of the world (literally) have a lot of distance-based latency, but we cannot make electrical or optical signals travel faster (and more bandwidth doesn't help).
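
To put a rough number on distance-based latency, here is a minimal sketch. It assumes light in optical fiber travels at about two thirds of its vacuum speed and ignores routing detours, queuing and device delays, so it only gives a lower bound. The function name and the example distances are made up for illustration.

```python
# Rough lower bound on round-trip time caused by distance alone.
# Assumption: signals in fiber travel at roughly 2/3 the speed of light in vacuum.
SPEED_OF_LIGHT_KM_S = 299_792
FIBER_FACTOR = 2 / 3


def min_rtt_ms(path_km: float) -> float:
    """Best-case round-trip time in milliseconds over a fiber path of path_km."""
    one_way_s = path_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    # 20,000 km is roughly half the Earth's circumference.
    for km in (1, 1_000, 20_000):
        print(f"{km:>6} km path: at least {min_rtt_ms(km):.2f} ms round trip")
```

The half-way-around-the-world case works out to about 200 ms per round trip before any equipment adds its own delay, and no amount of extra bandwidth changes that figure.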

The keys to fast SQL performance: don't transmit data across the network if you don't really need to, but if you do need to, don't do it in many partial operations.

(BTW, as an example of the impact of partial data transfers, ever use tftp across a WAN vs. a LAN? See the WAN be much slower, and not even come near using all available bandwidth? As an example of not sending data if you don't need to, ever compress data to reduce transmission time?)
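
A back-of-the-envelope sketch of why many small operations hurt. The round-trip counts are hypothetical, and the model assumes each operation waits for the previous one and ignores server processing and transfer time.

```python
def total_wait_s(round_trips: int, latency_ms: float) -> float:
    """Time spent purely waiting on sequential round trips, in seconds."""
    return round_trips * latency_ms / 1000


if __name__ == "__main__":
    latency_ms = 3.0  # the average wait quoted in this thread
    chatty = total_wait_s(5_000, latency_ms)   # hypothetical: one query per row
    batched = total_wait_s(5, latency_ms)      # hypothetical: a few set-based queries
    print(f"5000 round trips: {chatty:.3f} s of waiting")
    print(f"   5 round trips: {batched:.3f} s of waiting")
```

With these made-up counts, the same 3 ms per operation adds up to 15 seconds in the chatty case and 15 milliseconds in the batched one.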

baselzind · ‎07-14-2024

But please, what does the screenshot mean? What are the wait times? Couldn't the wait time be a result of the DB taking too long to respond?

Joseph W. Doherty · ‎07-15-2024

@baselzind wrote:

But please, what does the screenshot mean? What are the wait times? Couldn't the wait time be a result of the DB taking too long to respond?

I believe (?) the wait times are, indeed, the time the client is waiting for results from "dblink".

3 ms shouldn't be very noticeable, but it's also noted as being an average. Which can easily mean that, every now and then, it's longer, possibly much longer, but the average is low due to many good responses. Usually even a few very long responses will generate complaints that response time is always bad, because that's what's remembered.

What you might ask for, assuming such delays are seen on clients, is to have the client run a continuous ping test to the same SQL server(s) and compare those numbers with the app's response time. If the pings have good times, the delay is not in the network between client and server.
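
A small sketch of how a low average can hide rare long waits. The sample data is invented for illustration and is not taken from the screenshot.

```python
import statistics


def summarize_latency(samples_ms, slow_threshold_ms=1_000.0):
    """Return mean, median, worst case and count of slow responses."""
    return {
        "mean_ms": statistics.mean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
        "slow_count": sum(1 for s in samples_ms if s >= slow_threshold_ms),
    }


# Hypothetical data: 9,998 responses at 1 ms plus two 10-second stalls.
samples_ms = [1.0] * 9_998 + [10_000.0] * 2

if __name__ == "__main__":
    for name, value in summarize_latency(samples_ms).items():
        print(f"{name}: {value}")
```

Here the mean still comes out at about 3 ms even though two users waited 10 seconds, which is why the maximum or a count of slow responses is worth asking for alongside the average.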

However, more than one SQL server might be involved. If so, have continuous pings run between them.

BTW, understand, slow pings between hosts might reflect the hosts also being slow processing pings. So, if those pings show a delay, next step would be to place dedicated ping hosts on the same network segments as the end hosts, and measure those ping times.
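
If ICMP ping is not convenient, one possible variation is to time TCP connections to the database listener instead. The sketch below is only an illustration: the host name `db1.example.internal` and port 1433 are placeholders, and a TCP connect time includes a little host processing, so it is not a pure network measurement.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=2.0):
    """Return the TCP connect time in ms, or None if the connect fails or times out."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def probe(host, port, count=10, interval_s=1.0):
    """Take `count` connect-time samples, pausing `interval_s` between them."""
    samples = []
    for _ in range(count):
        samples.append(tcp_connect_ms(host, port))
        time.sleep(interval_s)
    return samples


def summarize(samples):
    """Summarize samples; failed probes are stored as None."""
    ok = [s for s in samples if s is not None]
    return {
        "sent": len(samples),
        "failed": len(samples) - len(ok),
        "min_ms": min(ok) if ok else None,
        "avg_ms": sum(ok) / len(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }


if __name__ == "__main__":
    # Placeholder target; replace with a real DB host and listener port.
    print(summarize(probe("db1.example.internal", 1433, count=10)))
```

Running it from the client toward each server during a slow period, and logging the results with timestamps, gives numbers that can be lined up against the application's own response times.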

You might also check all host access ports for drops. Such would indicate transit congestion on the edge ports, and drops can very much slow network throughput. It's also good to likewise check all network transit ports for drops. (IMO, drops are a much better indicator of network congestion issues than any load percentage stat.)

baselzind · ‎07-17-2024

Thanks for the input, but from what I understood, the application is in the DMZ zone and has two DBs on the inside. When those two DBs pull information from two more DBs on the inside, that is where the slowness occurs: it takes over 10 (sometimes up to 20) seconds to open something. So should I do a ping to the application and the FOUR DB in question?

Joseph W. Doherty · ‎07-17-2024

". . . should i do a ping to the application and the FOUR DB in question?"

Worth trying.

That said, I would wonder, again, how well the application mixes and mashes between 4 (!) databases.

Also, if the databases are internal and the app is running across a FW, that could be a potential bottleneck.

BTW, it would be interesting to know if the problem app (function) and/or its prod (volume of) data is relatively new, i.e. coincides with the performance issue. SQL can easily run into scalability issues, i.e. it worked perfectly fine during development but not in prod.

Flavio Miranda · ‎07-17-2024

@baselzind

Run a packet capture on the server or on the path between source and destination and share the .pcap file here. Make sure to reproduce the problem while you are capturing, and tell us which is the source IP and which is the destination.

baselzind · ‎07-17-2024

According to the application guy, the application has two DBs and these two pull information from two other DBs, and he won't specify which DB is the one related to the slowness. So how can I packet capture from 2 sources and 2 destinations? Also, the problem happens randomly and goes away!

Joseph W. Doherty · ‎07-18-2024

BTW, have you checked all possible network transit devices for interface drops and CPU spikes?

Potentially, IF the problem is a network issue (which is a possibility), it would likely be due to temporary ("happens randomly") traffic congestion. Such issues are often invisible to typical network monitoring because they are so transient.
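
A toy illustration of why transient congestion can hide from typical monitoring. It assumes a 5-minute polling interval and invented per-second utilization figures, with a 2-second burst at line rate.

```python
# Hypothetical per-second link utilization (%) over one 5-minute polling interval.
per_second_util = [5.0] * 300
per_second_util[120:122] = [100.0, 100.0]  # a 2-second burst at line rate

# What a 5-minute poll reports versus what actually happened.
polled_average = sum(per_second_util) / len(per_second_util)
peak = max(per_second_util)

if __name__ == "__main__":
    print(f"5-minute average: {polled_average:.2f}%")
    print(f"1-second peak:    {peak:.0f}%")
```

The polled graph would show under 6% load for that interval, while packets arriving during the burst could still have been queued or dropped, which is why interface drop counters tend to be more telling than load percentages.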

As to packet capture, possibly target all the involved host access ports.

Flavio Miranda · ‎07-18-2024

Depending on where you place your capture, you can take traffic for both. But if the problem is not reproducible, it would be very hard to identify the problem using packet capture.