
TimeOutException

rodica.pastusac
Level 1

Hello Vikram,

Yesterday and last week I had no problems with RuntimeException, and I thought everything was finally fine. But today I launched my application 3 times for the same Delta file, and in 2 of those runs the DB crashed.

We still have the problem with TimeOutException.


Here is the answer I got from the analysts:

Perhaps Vikram could point you to a piece of code that we could use to test for latency?

Or can you create a piece of code to test the latency that we could run on another server in a different environment?

Do you have a test for latency?

And today I received a new exception: UnavailableException, errorMessage: Not enough replica available for query at consistency LOCAL_QUORUM (2 required but only 1 alive).

Usually when this exception happens, it indicates an insufficient replication factor in the DB. In our case, where is the problem?
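For reference, Cassandra sizes a quorum as floor(RF / 2) + 1, where RF is the replication factor (for LOCAL_QUORUM, the replication factor in the local datacenter). The sketch below just works through that arithmetic. The class name is made up for illustration, and the actual replication factor of the Context Service database is not stated anywhere in this thread. "2 required" is consistent with an RF of 2 or 3, and "only 1 alive" means the coordinator could see just one of those replicas at that moment, which can happen when nodes are down even if the replication factor itself is adequate.

```java
// Illustrative only: QuorumMath is a hypothetical name, not part of any SDK.
final class QuorumMath {

    /** Quorum size for a given replication factor: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replicationFactor must be at least 1");
        }
        return replicationFactor / 2 + 1; // integer division floors for positive values
    }

    public static void main(String[] args) {
        // Print the quorum size and how many replicas may be down for each RF.
        for (int rf = 1; rf <= 5; rf++) {
            int q = quorum(rf);
            System.out.printf("RF=%d -> quorum=%d, tolerates %d replica(s) down%n",
                    rf, q, rf - q);
        }
    }
}
```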

So on the first run I got:

12:08:39.597 [Thread-13] DEBUG c.c.t.c.ContextEncryptionService - Search by field values is called for </context/context/v1/search> with  parameters: {wg=[production], op=[AND], piiElementHashes.IA_Company=[35c08c984e664772f702bc3e7729bbb8af4d497b7d7f3fcf21231d772604a26e], piiElementHashes.IA_Num_Client=[46301399d645648c85702e179b1c6f9d29246affc60c28fd4867a6d3e369d94b], type=[customer]}

12:08:39.667 [Thread-41] INFO  com.cisco.thunderhead.RESTClient - ERROR HTTP STATUS = 400

12:08:39.687 [Thread-41] INFO  com.cisco.thunderhead.RESTClient - Error on CREATE: https://context-service.produs1.ciscoccservice.com/context/context/v1

com.cisco.thunderhead.errors.CassandraDatabaseRestApiException: java.lang.Exception: {"errorType":"databaseDriver.exception.queryExecution","errorData":"UnavailableException","errorMessage":"Not enough replica available for query at consistency LOCAL_QUORUM (2 required but only 1 alive), trackingId: a3495374-78e5-418c-82c1-abcf5e55c9b5"}: RestApiError with errorType: databaseDriver.exception.queryExecution, errorData:UnavailableException, errorMessage: Not enough replica available for query at consistency LOCAL_QUORUM (2 required but only 1 alive), trackingId: a3495374-78e5-418c-82c1-abcf5e55c9b5

                at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_101]

                at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_101]

                at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_101]

                at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_101]

                at com.cisco.thunderhead.errors.ApiExceptionFactory.getApiException(ApiExceptionFactory.java:78) ~[context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.errors.ApiExceptionFactory.generateApiException(ApiExceptionFactory.java:113) ~[context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.RESTClient.createExceptionFromErrorString(RESTClient.java:841) [context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.RESTClient.throwApiException(RESTClient.java:858) [context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.RESTClient.create(RESTClient.java:479) [context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.RESTClient.create(RESTClient.java:435) [context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.RESTClient.create(RESTClient.java:534) [context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.client.BaseEncryptionService.encryptAndCreate(BaseEncryptionService.java:134) [context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.client.ContextServiceClientImpl$1.execute(ContextServiceClientImpl.java:366) [context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.client.ContextServiceClientImpl$1.execute(ContextServiceClientImpl.java:362) [context-service-sdk-extension-2.0.2-10384.jar:na]

                at com.cisco.thunderhead.client.ContextServiceClientImpl$ContextCallable.call(ContextServiceClientImpl.java:1077) [context-service-sdk-extension-2.0.2-10384.jar:na]

                at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_101]

                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]

                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_101]

                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]

                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]

                at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]

Caused by: java.lang.Exception: {"errorType":"databaseDriver.exception.queryExecution","errorData":"UnavailableException","errorMessage":"Not enough replica available for query at consistency LOCAL_QUORUM (2 required but only 1 alive), trackingId: a3495374-78e5-418c-82c1-abcf5e55c9b5"}

                ... 15 common frames omitted

And on the second:

12:33:37.484 [Thread-39] INFO  com.cisco.thunderhead.RESTClient - Error on UPDATE: https://context-service.produs1.ciscoccservice.com/context/context/v1/id/a93b57c0-85cc-11e6-bc9e-7d84b7a5ad80

12:33:37.487 [Thread-39] ERROR com.cisco.thunderhead.RESTClient - Attempt to connect failed: type:timeoutRequest, data: path: /context/context/v1/id/a93b57c0-85cc-11e6-bc9e-7d84b7a5ad80, message: Request timed out: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out

                at com.sun.jersey.client.apache4.ApacheHttpClient4Handler.handle(ApacheHttpClient4Handler.java:187)

                at com.sun.jersey.api.client.Client.handle(Client.java:652)

                at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)

                at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)

                at com.sun.jersey.api.client.WebResource$Builder.put(WebResource.java:539)

                at com.cisco.thunderhead.RESTClient.update(RESTClient.java:666)

                at com.cisco.thunderhead.RESTClient.update(RESTClient.java:648)

                at com.cisco.thunderhead.RESTClient.update(RESTClient.java:705)

                at com.cisco.thunderhead.client.BaseEncryptionService.encryptAndUpdate(BaseEncryptionService.java:160)

                at com.cisco.thunderhead.client.ContextServiceClientImpl$2.execute(ContextServiceClientImpl.java:422)

                at com.cisco.thunderhead.client.ContextServiceClientImpl$2.execute(ContextServiceClientImpl.java:418)

                at com.cisco.thunderhead.client.ContextServiceClientImpl$ContextCallable.call(ContextServiceClientImpl.java:1077)

                at java.util.concurrent.FutureTask.run(FutureTask.java:266)

                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)

                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

                at java.lang.Thread.run(Thread.java:745)

I'm trying to understand how the Cisco DB works so that I can avoid such exceptions, but without success. Please help me understand where the bug is.

Thanks,

Rodica

5 Replies

rodica.pastusac
Level 1

Our custom application is running on a Windows 2012 server.

context-service-sdk-2.0.1

slf4j-1.7.21

eclipse-inst-win64

Eclipse IDE for Java Developers

Version: Neon Release (4.6.0)

Build id: 20160613-1800

JavaSetup8u101

jdk-8u101-windows-x64

jre1.8.0_51

apache-maven-3.3.9

I sent the log file to Vikram.

Thanks

Today I launched the application one more time with the same parameters for the same DB, but with LAB_MODE=true, and it worked with no exception. The timeout exception occurred twice in a few weeks of tests, and we have to figure out why and under what circumstances it happens.

Thank you,

Rodica

vchhabra
Level 5

Hi Rodica,

A few things:

1) We found out that the CassandraDatabaseRestApiException took place while we were running a maintenance job on our cloud platform. Our platform team has taken an action item to make their maintenance window more seamless, so that it doesn't impact end-user actions like this one.

2) Regarding timeouts (as we discussed in your other thread, Re: Multiple Update of the same customer?), I suggest that you build logic into your custom application to retry upon timeout or failure. The reason is that you are accessing the service over the public Internet (like any other SaaS), so there will be factors beyond your or Cisco's control. The SDK has some retry logic built in as protection, but we've seen unusually high latency from your lab (in the 4-5 second range per REST API call, and some SDK calls are multiple REST API calls internally). If you are in an environment that adds this latency (like a slow proxy), then your custom application should catch the timeout and retry.

3) Over the next few weeks, we are going to add a capability to the SDK that will expose latency metrics via JMX counters. You will be able to use them to measure the roundtrip time that you are seeing from your custom app. This will help you fine-tune your app for your environment. We will update the doc on DevNet to explain how to leverage them.
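As a rough illustration of points 2 and 3, here is a minimal sketch of an application-side retry that also prints how long each attempt took. It is plain Java and does not use any SDK class: the names `RetryOnTimeout`, `callWithRetry`, and `isTimeout` are made up for this example, and it assumes the timeout reaches your code with a `java.net.SocketTimeoutException` somewhere in the cause chain (as in the "Read timed out" log above). If the SDK surfaces timeouts differently in your application, the `isTimeout` check would need to be adapted.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration; it is not part of the Context Service SDK.
final class RetryOnTimeout {

    /**
     * Runs the call up to maxAttempts times. Only timeouts are retried;
     * the wait between attempts starts at initialBackoffMillis and doubles.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long initialBackoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoffMillis = initialBackoffMillis;
        Exception lastTimeout = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long startNanos = System.nanoTime();
            try {
                T result = call.call();
                logAttempt(attempt, startNanos, "succeeded");
                return result;
            } catch (Exception e) {
                logAttempt(attempt, startNanos, "failed: " + e);
                if (!isTimeout(e)) {
                    throw e; // not a timeout: rethrow without retrying
                }
                lastTimeout = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis); // wait before the next attempt
                    backoffMillis *= 2;
                }
            }
        }
        throw lastTimeout; // every attempt timed out
    }

    /** Assumption: a timeout shows up as SocketTimeoutException in the cause chain. */
    static boolean isTimeout(Throwable t) {
        for (Throwable cause = t; cause != null; cause = cause.getCause()) {
            if (cause instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }

    /** Prints the wall-clock time of one attempt, a crude stand-in for latency metrics. */
    private static void logAttempt(int attempt, long startNanos, String outcome) {
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000;
        System.out.printf("attempt %d %s after %d ms%n", attempt, outcome, elapsedMillis);
    }
}
```

You would wrap your own create or update call in a lambda, for example `RetryOnTimeout.callWithRetry(() -> updateCustomer(customer), 3, 2000)`, where `updateCustomer` stands for whatever method in your application calls the SDK. One caveat: a read timeout does not tell you whether the server applied the request, so a retried create could produce a duplicate. Searching for the customer before re-creating it is one way to guard against that.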

regards,

Vikram

Hi Vikram,

So, theoretically, we should not encounter CassandraDatabaseRestApiException anymore?

For the TimeoutException, I have a bit of code that catches the ApiException and retries one more time to create/update the needed Customer. Is that the solution that you propose? And as for latency, I will be able to check it in a few weeks. Is that right?

I want to be sure that I understood your solution correctly. On Friday, 18/11/2016, we are supposed to redo the initial loading, which will take more than 30 hours. So I'm trying to get all the exceptions fixed so that we can be productive over the weekend.

Thanks a lot for your help.

Rodica