GSoC 2020 Week-3

21 June 2020 - 5 mins read time
Tags: Sympy GSoC

I spent this week working on the implementation of the ParametricIntegral class.

Modifying API of ParametricRegion

When I was writing the test cases, I realized that the API of the ParametricRegion could be improved. Instead of passing limits as a dictionary, tuples can be used. So I modified the API of the ParametricRegion class. The new API is closer to the API of the integral class and more intuitive. I made a separate PR for this change to make it easy for reviewers.

Example of the new API:

ParametricRegion( ( x+y, x*y ), (x, 0, 2), (y, 0, 2))

Handling scalar fields with no base scalars

As discussed in previous posts, we decided to not associate a coordinate system with the parametric region. Instead, we assume that the parametricregion is defined in the coordinate system of the field of which the integral is being calculated. We calculate the position vector and normal vector of the parametric region using the base scalars and vectors of the fields. This works fine for most cases. But when the field does not contain any base scalar or vector in its expression, we cannot extract the coordinate system from the field.

ParametricIntegral(150, circle)
# We cannot determine the coordinate system from the field. 
# To calculate the line integral, we need to find the derivative of the position vector.
# This is not possible until we know the base vector of the field

To handle this situation, I assign a coordinate system C to the region. This does not affect the result in any way as the result is independent of it. It just allows the existing algorithm to work in this case.

Separate class for vector and scalar fields

Francesco suggested making separate classes based on the nature of the field: vector and scalar. I am open to this idea. But I think it will be more easy and intuitive for the users if they can use the same class to calculate the integral. I do not think they are different enough from a user’s perspective to have a separate interface.

Maybe we can have a function vectorintegrate which returns the object of ParametricVectorIntegral or ParametricScalarIntegral depending on the nature of the field. This can work for other types of integrals too. Suppose we implement a class ImplicitIntegral to calculate the integral over an implicit region. The vectorintegrate function can then return an object of ImplicitIntegral object by identifying the region is defined implicitly. I think this will be great. I will have more discussion with Francesco on this aspect.

Topological sort of parameters

When evaluating double integral, the result some times depend upon the order in which the integral is evaluated. If the bounds of one parameter u depend on another parameter v, we should integrate first with respect to u and then v.

For example, consider the problem of evaluating the area of the triangle.

T = ParametricRegion((x, y), (x, 0, 2), (y, 10 - 5*x))

The area of the triangle is 10 units and should be independent of the order parameters are specified at the time of object initialization. But the double integral depends on the order of integration.

>>> integrate(1, (x, 0, 2), (y, 10 - 5*x))
20 - 10*x
>>> integrate(1, (y, 0, 10 - 5*x), (x, 0, 2))
10

So parameters must be passed to integrate in the correct order. To overcome this issue, we topologically sort the parameters. SymPy already had a function to perform topologically sort in its utilities module. I implemented a function that generates the graph and passes it to the topological_sort function. This made my work easy.

Long computation time of Integrals

Some integrals are taking too long to compute. When base scalars in the field are replaced by their parametric equivalents, the expression of the field becomes large. Further, the integrand is the dot product of the field with a vector or product of the field and the magnitude of the vector. The integrate function takes about 20-30 seconds to calculate the integral. I think this behavior is due to the expression of integrand growing structurally despite it being simple.

For example,

>>>solidsphere = ParametricRegion((r*sin(phi)*cos(theta), r*sin(phi)*sin(theta), r*cos(phi)),\
                                (r, 0, 2), (theta, 0, 2*pi), (phi, 0, pi))
>>> ParametricIntegral(C.x**2 + C.y**2, solidsphere)

In this case, the parametric field when replaced with parametersr become r**2*sin(phi)**2*sin(theta)**2 + r**2*sin(phi)**2*cos(theta)**2 although it can be easily simplified to r**2*sin(phi).

SymPy has a function called simplify. simplify attempts to apply various methods to simplify an expression. When the integrand is simplified using it before passing to integrate, the result is returned almost immediately. Francesco rightly pointed out that simplify is computationally expensive and we can try to some specific simplification. I will look into it.

Failing test cases

Some test cases are failing because of integrate function returning results in different forms. The results are mathematically equivalent but different in terms of structure. I found this strange. I do not think this has to do with hashing. I still have not figured out this problem.

Next week’s goal

Hopefully, I will complete the work on the ParamatricIntegral and get it merged. We can then start discussing about representing implicit regions.