11-16-2016 06:13 AM
Hello Vikram,
Yesterday and last week I had no problems with RuntimeException, and I thought that everything was finally fine. But today I launched my application 3 times for the same Delta file, and in 2 of those runs the DB crashed.
We still have the problem with TimeOutException.
Here is the answer I got from the analysts:
Perhaps Vikram could point you to a piece of code that we could use to test for latency?
Or can you create a piece of code to test the latency that we could run on another server in a different environment?
Do you have a test for latency?
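For illustration only, a latency test of the kind requested here could be as small as the sketch below. It is not part of the Context Service SDK: `LatencyProbe`, `measureMillis`, `median` and `max` are hypothetical names, and the operation to time is passed in as a plain `Runnable`, so the same class can be run unchanged on another server.

```java
import java.util.Arrays;

// Hypothetical helper, not an SDK class: times any operation a fixed number of times.
final class LatencyProbe {

    /** Runs the operation `samples` times and returns the elapsed times in ms, sorted ascending. */
    static long[] measureMillis(Runnable operation, int samples) {
        if (samples <= 0) {
            throw new IllegalArgumentException("samples must be positive");
        }
        long[] elapsed = new long[samples];
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();          // monotonic clock, safe for intervals
            operation.run();
            elapsed[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        Arrays.sort(elapsed);
        return elapsed;
    }

    /** Middle element of a sorted array (the upper median when the count is even). */
    static long median(long[] sorted) {
        return sorted[sorted.length / 2];
    }

    /** Largest element of a sorted array. */
    static long max(long[] sorted) {
        return sorted[sorted.length - 1];
    }
}
```

To use it, wrap whichever SDK call you want to time (for example a single get or search) in the `Runnable`, take 20 or so samples, and compare the median and maximum between the two environments.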
And today I received a new exception: UnavailableException, errorMessage: Not enough replica available for query at consistency LOCAL_QUORUM (2 required but only 1 alive).
Usually when this exception happens it indicates an insufficient replication factor in the DB. In our case, where is the problem?
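As background on the "2 required but only 1 alive" part of the message: in Cassandra a quorum is floor(RF / 2) + 1 replicas, and LOCAL_QUORUM applies that to the replicas in the local datacenter. The sketch below only illustrates the arithmetic. The replication factor of the Cisco cluster is not stated anywhere in this thread, so the value 3 in the comments is an assumption, and `QuorumMath` is a hypothetical name.

```java
// Hypothetical helper that mirrors Cassandra's quorum arithmetic; not an SDK or driver class.
final class QuorumMath {

    /** Replicas that must respond for a quorum: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be at least 1");
        }
        return replicationFactor / 2 + 1;   // integer division is the floor for positive values
    }

    /** True if enough replicas are alive to serve a quorum request. */
    static boolean canSatisfy(int replicationFactor, int aliveReplicas) {
        return aliveReplicas >= quorum(replicationFactor);
    }
}
```

With an assumed replication factor of 3, `quorum(3)` is 2, so `canSatisfy(3, 1)` is false. That matches the message: the request fails when only one replica is reachable, even if the replication factor itself is adequate.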
So for the first run I got:
12:08:39.597 [Thread-13] DEBUG c.c.t.c.ContextEncryptionService - Search by field values is called for </context/context/v1/search> with parameters: {wg=[production], op=[AND], piiElementHashes.IA_Company=[35c08c984e664772f702bc3e7729bbb8af4d497b7d7f3fcf21231d772604a26e], piiElementHashes.IA_Num_Client=[46301399d645648c85702e179b1c6f9d29246affc60c28fd4867a6d3e369d94b], type=[customer]}
12:08:39.667 [Thread-41] INFO com.cisco.thunderhead.RESTClient - ERROR HTTP STATUS = 400
12:08:39.687 [Thread-41] INFO com.cisco.thunderhead.RESTClient - Error on CREATE: https://context-service.produs1.ciscoccservice.com/context/context/v1
com.cisco.thunderhead.errors.CassandraDatabaseRestApiException: java.lang.Exception: {"errorType":"databaseDriver.exception.queryExecution","errorData":"UnavailableException","errorMessage":"Not enough replica available for query at consistency LOCAL_QUORUM (2 required but only 1 alive), trackingId: a3495374-78e5-418c-82c1-abcf5e55c9b5"}: RestApiError with errorType: databaseDriver.exception.queryExecution, errorData:UnavailableException, errorMessage: Not enough replica available for query at consistency LOCAL_QUORUM (2 required but only 1 alive), trackingId: a3495374-78e5-418c-82c1-abcf5e55c9b5
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_101]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_101]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_101]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_101]
at com.cisco.thunderhead.errors.ApiExceptionFactory.getApiException(ApiExceptionFactory.java:78) ~[context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.errors.ApiExceptionFactory.generateApiException(ApiExceptionFactory.java:113) ~[context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.RESTClient.createExceptionFromErrorString(RESTClient.java:841) [context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.RESTClient.throwApiException(RESTClient.java:858) [context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.RESTClient.create(RESTClient.java:479) [context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.RESTClient.create(RESTClient.java:435) [context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.RESTClient.create(RESTClient.java:534) [context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.client.BaseEncryptionService.encryptAndCreate(BaseEncryptionService.java:134) [context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.client.ContextServiceClientImpl$1.execute(ContextServiceClientImpl.java:366) [context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.client.ContextServiceClientImpl$1.execute(ContextServiceClientImpl.java:362) [context-service-sdk-extension-2.0.2-10384.jar:na]
at com.cisco.thunderhead.client.ContextServiceClientImpl$ContextCallable.call(ContextServiceClientImpl.java:1077) [context-service-sdk-extension-2.0.2-10384.jar:na]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
Caused by: java.lang.Exception: {"errorType":"databaseDriver.exception.queryExecution","errorData":"UnavailableException","errorMessage":"Not enough replica available for query at consistency LOCAL_QUORUM (2 required but only 1 alive), trackingId: a3495374-78e5-418c-82c1-abcf5e55c9b5"}
... 15 common frames omitted
And for the second:
12:33:37.484 [Thread-39] INFO com.cisco.thunderhead.RESTClient - Error on UPDATE: https://context-service.produs1.ciscoccservice.com/context/context/v1/id/a93b57c0-85cc-11e6-bc9e-7d84b7a5ad80
12:33:37.487 [Thread-39] ERROR com.cisco.thunderhead.RESTClient - Attempt to connect failed: type:timeoutRequest, data: path: /context/context/v1/id/a93b57c0-85cc-11e6-bc9e-7d84b7a5ad80, message: Request timed out: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
at com.sun.jersey.client.apache4.ApacheHttpClient4Handler.handle(ApacheHttpClient4Handler.java:187)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.put(WebResource.java:539)
at com.cisco.thunderhead.RESTClient.update(RESTClient.java:666)
at com.cisco.thunderhead.RESTClient.update(RESTClient.java:648)
at com.cisco.thunderhead.RESTClient.update(RESTClient.java:705)
at com.cisco.thunderhead.client.BaseEncryptionService.encryptAndUpdate(BaseEncryptionService.java:160)
at com.cisco.thunderhead.client.ContextServiceClientImpl$2.execute(ContextServiceClientImpl.java:422)
at com.cisco.thunderhead.client.ContextServiceClientImpl$2.execute(ContextServiceClientImpl.java:418)
at com.cisco.thunderhead.client.ContextServiceClientImpl$ContextCallable.call(ContextServiceClientImpl.java:1077)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I'm trying to understand how the Cisco DB works so that I can avoid such exceptions, but so far without success. Please help me understand where the bug is.
Thanks,
Rodica
11-16-2016 06:13 AM
Our custom application is running on a Windows 2012 server.
context-service-sdk-2.0.1
slf4j-1.7.21
eclipse-inst-win64
Eclipse IDE for Java Developers
Version: Neon Release (4.6.0)
Build id: 20160613-1800
JavaSetup8u101
jdk-8u101-windows-x64
jre1.8.0_51
apache-maven-3.3.9
11-16-2016 06:14 AM
I sent the log file to Vikram.
Thanks
11-16-2016 06:31 AM
Today I launched the application one more time with the same parameters for the same DB, but with LAB_MOde=true, and it worked with no exception. The TimeOut Exception occurred twice in a few weeks of tests, and we have to figure out why and under what circumstances it happens.
Thank you,
Rodica
11-16-2016 02:19 PM
Hi Rodica,
A few things:
1) We found out that the CassandraDatabaseRestApiException took place while we were running a maintenance job on our cloud platform. Our platform team has taken the action item to make their maintenance window more seamless, so that it doesn't impact end-user actions like this one.
2) Regarding timeouts (like we discussed in your other thread Re: Multiple Update of the same customer?) - I suggest that you build logic in your custom application to retry upon timeout or failure. The reason is that you are accessing the service over the public Internet (like any other SaaS). There will be factors beyond your or Cisco's control. The SDK has some retry logic built in to protect against this, but we've seen unusually high latency from your lab (in the 4-5 second range per REST API call, and some SDK calls are multiple REST API calls internally). If you are in an environment that is adding this latency (like a slow proxy), then your custom application should catch the timeout and retry.
3) Over the next few weeks, we are going to add a capability to the SDK that will expose latency metrics via JMX counters. You will be able to use them to measure the round-trip time that you are seeing from your custom app. This will help you fine-tune your app for your environment. We will update the doc on DevNet to explain how to leverage them.
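As a rough illustration of the retry logic suggested in point 2, here is a minimal sketch of retry with exponential backoff. It is not the SDK's built-in retry. `RetryOnTimeout`, `callWithRetry` and `causedByTimeout` are hypothetical names, and the only assumption taken from the logs in this thread is that a timeout surfaces as a runtime exception with a `java.net.SocketTimeoutException` somewhere in its cause chain.

```java
import java.net.SocketTimeoutException;
import java.util.function.Supplier;

// Hypothetical helper, not an SDK class: retries an operation when it fails with a read timeout.
final class RetryOnTimeout {

    /** True if the error or any of its causes is a SocketTimeoutException. */
    static boolean causedByTimeout(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }

    /** Calls the operation up to maxAttempts times, doubling the wait after each timeout. */
    static <T> T callWithRetry(Supplier<T> operation, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                // Give up on the last attempt, or on errors that are not timeouts.
                if (attempt >= maxAttempts || !causedByTimeout(e)) {
                    throw e;
                }
            }
            try {
                Thread.sleep(delay);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();   // preserve the interrupt for the caller
                throw new IllegalStateException("interrupted while waiting to retry", ie);
            }
            delay *= 2;
        }
    }
}
```

The create or update call would be wrapped in the `Supplier`. One caution: a create that timed out on the client may still have succeeded on the server, so it may be safer to search for the record before re-creating it. An update by id can normally be repeated as-is.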
regards,
Vikram
11-17-2016 06:01 AM
Hi Vikram,
So, theoretically, we should not encounter CassandraDatabaseRestApiException anymore?
For TimeoutException, I have a bit of code that catches the ApiException and retries one more time to create/update the needed Customer. Is that the solution that you propose? And as for latency, in a few weeks I will be able to check it. Is that right?
I want to be sure that I correctly understood your solution. On Friday, 18/11/2016, we are supposed to run the initial load again, which will take more than 30 hours. So I'm trying to get all the exceptions fixed so that we can be productive over the weekend.
Thanks a lot for your help.
Rodica