Move transition matrix computation outside of main function #15
mfalkiewicz wants to merge 7 commits into sensein:master from
Conversation
    metric=self.metric)
    return self.affinity_matrix_
    self.affinity_matrix_ = self.affinity(X)
    return self.affinity_matrix_
the function description should be modified if this function is not going to return the affinity matrix.
I re-added the return of affinity_matrix_, but no longer store it as a field on self. The reason is that in the code I am currently writing I want to be able to bind the results of several affinity calculations to different fields in the object. A sketch of the intended usage pattern is below.
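A minimal sketch of that pattern, assuming a scikit-learn-style class; the class name and the affinity_a_/affinity_b_ fields here are hypothetical, not the package's actual API:

    import numpy as np
    from sklearn.metrics import pairwise_distances

    class Embedding:
        def __init__(self, metric='euclidean'):
            self.metric = metric

        def affinity(self, X):
            # Return the affinity matrix instead of assigning it to a
            # fixed attribute, so the caller chooses where it is bound.
            D = pairwise_distances(X, metric=self.metric)
            return np.exp(-D ** 2)

    emb = Embedding()
    X_a, X_b = np.random.rand(10, 3), np.random.rand(12, 3)
    emb.affinity_a_ = emb.affinity(X_a)  # caller binds each result to
    emb.affinity_b_ = emb.affinity(X_b)  # whichever field it needs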
    raise ImportError('Checks require scikit-learn, but not found')

    ndim = L.shape[0]
    if overwrite:
overwrite is no longer supported in the rewrite. i think this should still be available as a memory-saving option.
@mfalkiewicz - this is not supported in the refactor
    k = int(max(2, np.round(D.shape[0] * 0.01)))
    eps = 2 * np.median(np.sort(D, axis=0)[k+1, :])**2
    k = int(max(2, np.round(D.shape[0] * f_neighbors)))
    eps = np.median(np.sort(D, axis=0)[0:k+1, :]**2)
if i remember this correctly, eps is based on the (k+1)-th entry rather than on everything up to that point. it basically provides a scaling factor given the sorted distances.
The approach currently implemented is more liberal than the one I propose, and both are correct. All manifold learning approaches are based on local similarities, and these should be emphasized; I think the median of all values up to the k-th neighbor captures this property more accurately. Besides, parametrizing the kernel in terms of average distances to a certain fraction of nearest neighbors is more natural, because it does not depend on the scale of the affinities one uses. The two heuristics are contrasted in the sketch below.
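A small self-contained sketch contrasting the two heuristics; D is assumed to be a symmetric pairwise-distance matrix, and the code is illustrative rather than the package's implementation:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.random((100, 3))
    # symmetric pairwise Euclidean distance matrix
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))

    D_sorted = np.sort(D, axis=0)  # per column: ascending distances to one point
    k = int(max(2, np.round(D.shape[0] * 0.01)))

    # current heuristic: scale taken from the (k+1)-th sorted entry alone
    eps_current = 2 * np.median(D_sorted[k + 1, :]) ** 2

    # proposed heuristic: median of all squared distances up to the
    # k-th neighbor, emphasizing the local neighborhood as a whole
    eps_proposed = np.median(D_sorted[0:k + 1, :] ** 2)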
    L_alpha = d_alpha[:, np.newaxis] * L_alpha

    M = L_alpha
    M = _compute_markov_matrix(L, use_sparse = use_sparse, alpha = alpha)
M = _compute_markov_matrix(L if overwrite else L.copy(), use_sparse = use_sparse, alpha = alpha)
Regarding this comment and overwrite - I agree that it would be nice to have a memory-saving option, but I am not sure this refactor allows for it. I don't know the details of Python's scoping rules, so maybe you could enlighten me. _compute_markov_matrix creates a local binding A. If we pass an alias (L) as an argument rather than a copy of the object (L.copy()), is A also going to be an alias? In other words, if we pass an alias to a variable in the lexical scope of the function compute_diffusion_map as an argument to _compute_markov_matrix, will _compute_markov_matrix mutate the object the alias refers to, or create its own local copy and mutate that instead?
Sorry for the slightly confusing description, but I don't know how to phrase it more simply.
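For reference, a short demonstration of the general Python/NumPy semantics in question; this is not the package code, and whether _compute_markov_matrix actually mutates its argument depends on whether its body uses in-place operations:

    import numpy as np

    def mutate_in_place(A):
        # A is another name for the caller's object;
        # in-place operations are visible to the caller
        A *= 2

    def rebind_local(A):
        # rebinding A to a new object leaves the caller's object untouched
        A = A * 2

    L = np.ones((2, 2))
    mutate_in_place(L)
    print(L[0, 0])  # 2.0 -- the caller's array was modified

    L = np.ones((2, 2))
    rebind_local(L)
    print(L[0, 0])  # 1.0 -- the caller's array is unchanged

    L = np.ones((2, 2))
    mutate_in_place(L.copy())  # passing a copy shields the original
    print(L[0, 0])  # 1.0

Under these semantics, the local binding A is an alias for the caller's array, which is why the `L if overwrite else L.copy()` suggestion above would work as a memory-saving switch whenever the function's body mutates its argument in place.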
@mfalkiewicz - should #13 still be a PR?

also tests are not passing - since the markov computation has changed. i would recommend creating a second
Refactoring the kernel normalization and transition matrix computation into an independent function will allow for more code re-use. A sketch of what that could look like is below.
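A minimal sketch of such a standalone function, assuming dense NumPy arrays; the alpha-normalization follows the standard diffusion-maps construction, and signature details from the PR (e.g. the use_sparse branch) are omitted here:

    import numpy as np

    def _compute_markov_matrix(L, alpha=0.5):
        # Sketch: turn a symmetric affinity matrix L into a row-stochastic
        # Markov transition matrix, dense case only.
        # alpha-normalization discounts the sampling density
        d = L.sum(axis=1)
        d_alpha = np.power(d, -alpha)
        L_alpha = d_alpha[:, np.newaxis] * L * d_alpha[np.newaxis, :]

        # row-normalize to obtain transition probabilities
        M = L_alpha / L_alpha.sum(axis=1)[:, np.newaxis]
        return M

With the computation factored out this way, both compute_diffusion_map and any external caller can reuse it.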